Replication Dataset Guidelines

Many journals, publishers, and funding agencies require researchers to deposit replication datasets in a public repository. The Dataverse Project helps researchers fulfill this requirement by supporting the deposit of replication datasets. Depositing makes this type of data easy for other researchers to discover and reuse, and lets them verify that a study can be replicated without having to contact the study's authors.

When setting up a replication dataset in a Dataverse repository, here are some helpful guidelines to follow:

1. Be sure to include all the necessary descriptive metadata (description, methods, data sources, etc.) that would make it easier for other researchers to discover your replication dataset. For example:

  • in the dataset form, click the "Replication Data for" button to add this text to the title, making it clear to other researchers that this dataset is intended for replicating a published study;
  • in the Publication Citation fields, include a permanent link to the original publication(s) based on the data (e.g., journal article or dissertation). If the publication is publicly available, you can also upload its txt/pdf along with the files for this dataset.

2. When you are ready to upload your replication dataset files into a Dataverse repository, make sure you have:

  • a list of the code, scripts, documents, and data files needed to make replication possible;
  • at least a subset of the original dataset files, containing only those variables necessary for replication of the published results;
  • files in formats that are preferred or commonly used in your discipline, with any information that must remain confidential removed from your datasets;
  • sets of computer program recodes (if needed);
  • program commands, code or script for analysis (if needed);
  • extracts of existing publicly available data (or very clear directions for how to obtain exactly the same ones you used); and
  • documentation files (full set) which can include:
    • “readme” file (explanatory document on how to use the files to replicate the study),
    • text/pdf file of the article (if no subscription is required),
    • list of links to software (or the actual software used to replicate the results),
    • codebook,
    • data collection instruments,
    • summary statistics,
    • project summaries, and
    • bibliographies of publications pertaining to the data.
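
Before uploading, the checklist above can be partly automated. The following is a minimal Python sketch, not a Dataverse tool: it lists every file in a local replication folder and reports which checklist items appear to be missing. The function names, the CHECKLIST mapping, and the file-name and extension patterns are all hypothetical assumptions chosen for illustration; adapt them to the conventions of your discipline.

```python
from pathlib import Path, PurePosixPath

# Hypothetical checklist: item -> (file-name keywords, file extensions).
# These patterns are illustrative assumptions, not Dataverse requirements.
CHECKLIST = {
    "readme": (("readme",), ()),
    "codebook": (("codebook",), ()),
    "analysis code": ((), (".py", ".r", ".do", ".sps", ".sas")),
    "data files": ((), (".csv", ".tab", ".dta", ".sav")),
}


def build_manifest(folder):
    """Return the sorted relative paths of every file under `folder`."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def matches(name, keywords, suffixes):
    """True if the file name contains a keyword or has a listed extension."""
    path = PurePosixPath(name.lower())
    return any(k in path.stem for k in keywords) or path.suffix in suffixes


def missing_items(manifest, checklist=CHECKLIST):
    """Return the checklist items that no file in the manifest satisfies."""
    return [
        item
        for item, (keywords, suffixes) in checklist.items()
        if not any(matches(name, keywords, suffixes) for name in manifest)
    ]
```

For example, calling `missing_items(build_manifest("my_replication_folder"))` on a folder of your own (the folder name here is a placeholder) returns the checklist items for which no matching file was found. The manifest itself can double as the list of files recommended above. A name-based check like this cannot confirm that a file's contents are complete or that confidential information has been removed, so it supplements a manual review rather than replacing it.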

To learn more about replication in the social sciences:

King, Gary. "Replication, Replication." PS: Political Science and Politics, with comments from nineteen authors and a response, "A Revised Proposal, Proposal," Vol. XXVIII, No. 3 (September 1995): pp. 443-499.