R08. Appraisal

From the CTS application:
The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users.

The Dataverse software supports active appraisal by:

  • Supporting workflows where depositors can create draft versions of datasets that collection support staff and third parties (such as data verification services) can review before the datasets are published. Collection support staff can establish such workflows by using the Dataverse software’s Submit for Review and Private URL features to help ensure that the repository publishes relevant and high-quality data and metadata.
  • Requiring that dataset depositors complete metadata fields necessary for creating dataset citations that follow Force11’s data citation principles and for contacting depositors.
  • Providing metadata fields that are informed by widely-used metadata standards.
  • Helping collection support staff control which metadata fields are required for creating datasets (in addition to the five fields already required).
  • Helping collection support staff reject and remove data that doesn’t fit their collection development policies.
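
The review workflow in the first bullet can also be driven through the Dataverse native API. The following is a minimal sketch, assuming the submitForReview and returnToAuthor endpoints described in the Dataverse API Guide; the server URL, API token, and DOI are hypothetical placeholders, and the exact endpoints should be checked against the version of the software an installation runs. The sketch only builds the HTTP requests and does not send them.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values for illustration only; replace with a real installation and token.
SERVER_URL = "https://demo.dataverse.example"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"


def _dataset_request(action, persistent_id, payload=None):
    """Build (but do not send) a POST request for a dataset-level action."""
    url = (
        f"{SERVER_URL}/api/datasets/:persistentId/{action}"
        f"?persistentId={quote(persistent_id, safe='')}"
    )
    headers = {"X-Dataverse-key": API_TOKEN}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method="POST")


def submit_for_review(persistent_id):
    """Depositor side: hand a draft dataset over to the curators."""
    return _dataset_request("submitForReview", persistent_id)


def return_to_author(persistent_id, reason):
    """Curator side: send a draft back with an explanation."""
    return _dataset_request(
        "returnToAuthor", persistent_id, {"reasonForReturn": reason}
    )


if __name__ == "__main__":
    request = return_to_author("doi:10.5072/FK2/ABCDEF", "Please add a ReadMe file.")
    print(request.get_method(), request.get_full_url())
```

Passing a built request to urllib.request.urlopen would send it. Keeping construction separate from sending makes it easy to inspect the request before it touches a production repository.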
     

Answers from successful applicants

Tilburg University Dataverse collection:

Depositors are requested to follow the instructions on how to submit a data set. These instructions are available at: https://www.tilburguniversity.edu/dataverse-nl/

The depositors need to prepare a data report together with the data files. The data report should include all the necessary information about the data file and its production for third-party researchers to be able to replicate the study or to re-use the data. The template for the data report is available at: https://www.tilburguniversity.edu/dataverse-nl/

Depositors are requested to deliver their data in the preferred formats. As part of the deposit instructions, Tilburg University RDO has compiled a list of accepted data formats. This limited list is based on the most used data formats and existing format lists (e.g. that of DANS). The list of accepted data formats is available at https://www.tilburguniversity.edu/dataverse-nl/

Depositors may be able to submit other formats on request. The staff closely follows developments in the field of preservation in digital archiving to advise the data producer or author on the durability of different data formats.
 

QDR:

QDR prioritizes the acquisition and curation of deposits according to our collection development and appraisal policy. Under that policy, and following its mission, QDR prioritizes “qualitative data, or data associated with mixed-method research with a strong qualitative component, that are generated and/or used in the social sciences or cognate disciplines” and/or that hold “great intellectual value and/or that are of high quality.” All data are reviewed by QDR’s Associate Director or Curation Specialist. As qualitative data archiving is a relatively new field, community norms are not yet well established or widely held. QDR largely follows the recommendations of the UK Data Archive for preparing qualitative data for archiving, and has also developed its own recommendations for data preparation and preferred formats.

QDR only requires depositors to complete a small number of metadata fields that provide basic bibliographic description of the data. However, QDR works closely with depositors to encourage and help them to provide in-depth documentation about the collection or generation and context of the data. Where metadata initially provided by depositors are too sparse to allow secondary users to make sense of the data (e.g., where no data collection methodology is described), QDR works with depositors to improve documentation. In line with the collection development policy, data that are found too lacking in documentation to be useful are not published. QDR curation staff, in collaboration with depositors, will convert detailed documentation into structured metadata. QDR's metadata application profile closely follows (a subset of) Data Documentation Initiative (DDI) Codebook, the de facto standard for social science metadata. To the extent possible, metadata categories are linked to more generic vocabularies, specifically Dublin Core and the DataCite Metadata Kernel.

QDR staff converts files that are in sub-optimal formats into preferred file formats during curation where possible. Where no suitable format for archiving exists, QDR archives files as they are and commits to bit-level preservation. QDR staff proofreads and systematizes documentation provided by depositors to generate rich, standardized metadata (see R9 for more on file formats, conversion, and metadata).

Links:
Data preparation guidance: https://qdr.syr.edu/guidance/preparing-data
Recommended file formats: https://qdr.syr.edu/guidance/managing/formatting-data
Collection development and appraisal policy: https://qdr.syr.edu/policies/collectiondevelopment
Metadata application profile: https://qdr.syr.edu/policies/metadata
UK Data file format recommendations: https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
 

DataverseNO:

4 – The guideline has been fully implemented in the repository

DataverseNO is a Norwegian national, generic repository for open research data. The DataverseNO Accession Policy [1] explains what DataverseNO can accept for archiving. The DataverseNO Accession Policy as well as the DataverseNO Deposit Guidelines [2] also include guidelines on how to select data for archiving.

Data accepted for archiving in DataverseNO are in digital formats, and they are either generated through the course of a research project and/or deposited with an expectation that public availability will allow the data to be used for research purposes. As a GENERIC repository, the collection development policy of DataverseNO does not put any limitations on the field of study represented in the data to be deposited. However, special collections within DataverseNO may in addition have requirements on the subject area of the research data to be deposited. Currently, TROLLing is the only special collection in DataverseNO. TROLLing only accepts research data from linguistics / language studies.

Although the mission of DataverseNO – with the possible exception of special collections – is to be a national generic repository for open research data, the repository strives to provide subject-specific expertise as far as possible; see also R6 and R11. This is why, as a main rule, data deposited into institutional collections or into the top-level collection of DataverseNO are curated by Research Data Service staff who are subject specialists in addition to being trained in research data management. Special collections of DataverseNO are without exception managed and curated by permanent Research Data Service staff who are subject specialists.

After deposit, each dataset is curated by Research Data Service staff before publication, to ensure compliance with the DataverseNO Accession Policy [1] [3], and the DataverseNO Deposit Guidelines [2], regarding completeness, organization and documentation of the data. If necessary, Research Data Service staff communicate with depositors to make the dataset compliant with these policies and guidelines.

Depositors must select or appraise which files to deposit so that the dataset is complete and understandable. As a general rule, enough data must be provided for others to be able to understand and replicate the study or otherwise (re)use the deposited data. Decisions on data selection and completeness should preferably be based on general discussions in the institutional, national and international research communities about what is appropriate and what is considered good practice within the discipline in question. This approach is fully in line with the recommendations in the National policy for research data management in Norway, which states that questions regarding what data researchers should make openly available “are questions that researchers themselves have to decide on through discussions in the institutional, national and international research communities about what is appropriate and what is considered good practice within different subject areas” [4].

The DataverseNO Accession Policy requires depositors to provide enough data and metadata (including a ReadMe file) so that others can understand and (re)use the data. Our Deposit Guidelines describe in more detail how datasets have to be prepared and documented according to best practice before they are deposited to the repository. Datasets submitted to the repository are curated by Research Data Service staff before they are published. The curation process assures as far as possible that the deposited datasets are complete and understandable. Datasets not complying with these requirements are returned to the author together with requests to adjust and/or better describe or document the dataset in order to comply with our guidelines. The curation procedures are described in the DataverseNO Curator Guidelines [5].

The DataverseNO Accession Policy requires deposited datasets to be in (a) preferred file format(s) to facilitate long-term preservation. The DataverseNO Deposit Guidelines include a list of preferred file formats for common document types. Adherence to preferred file formats is part of the curation process, as described in the DataverseNO Curator Guidelines. File formats not included in the list will be assessed during the curation process. Research Data Service staff closely follow best practice in the field of preservation in digital archiving in order to be able to advise depositors on the sustainability of different data formats.

As a main rule, DataverseNO requires data to be deposited in their original file format in addition to a preferred file format (if the original is not in a preferred format), as described in the DataverseNO Deposit Guidelines. If data are deposited in non-preferred file formats only, the dataset is returned to the depositor together with a request to provide the data in preferred file formats as well. The DataverseNO Deposit Guidelines also give advice on how to convert data files from non-preferred file formats into preferred file formats. However, if the research data are represented in a non-preferred file format that is commonly used by the research community in question, and the file format cannot be converted into a preferred format, DataverseNO accepts the data for deposit with the limitations this implies for long-term preservation; see R10.
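
A curator could pre-screen a deposit against this rule with a small script. The following is a minimal sketch, not DataverseNO's actual tooling: the list of preferred extensions is a hypothetical stand-in for the list in the Deposit Guidelines, and pairing an original file with its preferred-format copy by shared base name is an assumption made here for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the preferred formats listed in the Deposit Guidelines.
PREFERRED_EXTENSIONS = {".csv", ".txt", ".pdf"}


def is_preferred(file_name):
    """True if the file's extension is on the preferred list (case-insensitive)."""
    return PurePosixPath(file_name).suffix.lower() in PREFERRED_EXTENSIONS


def files_needing_preferred_copy(file_names):
    """Return files in non-preferred formats that lack a preferred-format copy.

    Assumption: a preferred copy shares the original's path and base name,
    e.g. "survey.sav" is covered by "survey.csv".
    """
    covered = {
        str(PurePosixPath(name).with_suffix(""))
        for name in file_names
        if is_preferred(name)
    }
    return sorted(
        name
        for name in file_names
        if not is_preferred(name)
        and str(PurePosixPath(name).with_suffix("")) not in covered
    )


if __name__ == "__main__":
    deposit = ["survey.sav", "survey.csv", "interviews.docx", "readme.txt"]
    # Files listed here would prompt a request to the depositor.
    print(files_needing_preferred_copy(deposit))
```

A file flagged by such a check is not automatically rejected. As described above, formats that are common in a research community and cannot be converted may still be accepted after assessment by a curator.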

If – after the curation process – the depositor is not able to provide data that are sufficiently complete and sufficiently documented, the data cannot be published in DataverseNO. For data that have been accepted and published, the DataverseNO Deposit Agreement grants DataverseNO the right to amend the metadata as well as convert and migrate data files to any medium or format for the purposes of (long-term) preservation [6]. The measures for long-term preservation of datasets published in DataverseNO are described in R10. If the metadata provided in a published dataset nevertheless turn out at a later stage to be insufficient for long-term preservation, the Research Data Service staff responsible for curating the dataset in question will attempt to obtain more information about the dataset from the depositor in order to update the preservation metadata about the dataset. If this information cannot be obtained from the depositor, Research Data Service staff will ask for expert help from the Designated Community and the experts described in R6.

References:
[1] DataverseNO Accession Policy: https://site.uit.no/dataverseno/about/policy-framework/accession-policy/
[2] DataverseNO Deposit Guidelines: https://site.uit.no/dataverseno/deposit/
[3] DataverseNO Policy Framework and Definitions: https://site.uit.no/dataverseno/about/policy-framework/ (see section “Quality Commitment”)
[4] National policy for research data management in Norway (12/2017): https://www.regjeringen.no/contentassets/3a0ceeaa1c9b4611a1b86fc5616abde7/no/pdf/f-4442-b-nasjonal-strategi.pdf (p.26, Norwegian only; English translation given in answer to R0 above)
[5] DataverseNO Curator Guidelines: https://site.uit.no/dataverseno/admin-en/curatorguide/
[6] DataverseNO Deposit Agreement: https://site.uit.no/dataverseno/about/policy-framework/deposit-agreement/