About

The Project

Dataverse is an open-source web application to share, preserve, cite, explore, and analyze research data. It makes data easier to share with others and makes it easier to replicate others' work. Researchers, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility.

A Dataverse repository is the software installation, which then hosts multiple dataverses. Each dataverse contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data). As an organizing method, dataverses may also contain other dataverses.
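The containment hierarchy described above can be sketched in a few lines of Python. This is only an illustrative model, not the data model used by the Dataverse software itself: the class names (`Repository`, `DataverseCollection`, `Dataset`), the field names, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    # Descriptive metadata plus data files (documentation and code included).
    metadata: Dict[str, str]
    files: List[str] = field(default_factory=list)


@dataclass
class DataverseCollection:
    # A dataverse holds datasets and, optionally, other dataverses.
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    children: List["DataverseCollection"] = field(default_factory=list)

    def count_datasets(self) -> int:
        # Count datasets held here and in every nested dataverse.
        return len(self.datasets) + sum(c.count_datasets() for c in self.children)


@dataclass
class Repository:
    # One software installation hosting multiple top-level dataverses.
    name: str
    dataverses: List[DataverseCollection] = field(default_factory=list)


# Hypothetical sample data: a department dataverse containing a lab dataverse.
lab = DataverseCollection(
    "Example Lab",
    datasets=[Dataset({"title": "Survey 2015"},
                      ["survey.csv", "codebook.pdf", "analysis.R"])],
)
dept = DataverseCollection("Example Department", children=[lab])
repo = Repository("Example Repository", dataverses=[dept])
```

In this sketch, nesting is modeled by letting a dataverse hold a list of child dataverses, so a count over the department also includes the datasets of the lab inside it.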

The Strategic Goals

The strategic goals of Dataverse guide our release roadmap, our collaborations with the community, and the services that we provide. Currently, our goals are to:

  • increase adoption (users, dataverses, datasets, installations, journals)
  • finish Dataverse 4 migration features
  • develop the capability to handle Level 3 sensitive, large-scale, and streaming data
  • expand data and metadata features for existing and new disciplines
  • expand archival and preservation features
  • increase contributions from the open-source development community
  • improve UX and UI
  • continue to increase the quality of the software

The Collaboration

The Institute for Quantitative Social Science (IQSS) collaborates with the Harvard University Library and Harvard University Information Technology (HUIT) to make the Harvard Dataverse installation openly available to researchers and data collectors worldwide, from all disciplines, for depositing data. IQSS leads the development of the open-source Dataverse software and, with the Open Data Assistance Program at Harvard (a collaboration among Harvard Library, the Office for Scholarly Communication, and IQSS), provides user support. Library Technology Services at HUIT provides hosting and backup support for the Harvard Dataverse.

The History

Dataverse software is being developed at Harvard's Institute for Quantitative Social Science (IQSS), along with many collaborators and contributors worldwide. Dataverse builds on our experience with the earlier Virtual Data Center (VDC) project, which ran from 1999 to 2006 as a collaboration between the Harvard-MIT Data Center (now part of IQSS) and the Harvard University Library. Precursors to the VDC date to 1987 and include pre-web software that transferred cataloging information by FTP to other sites across campus automatically at designated times and, before that, a stand-alone software guide to local data.

The Team

Principal Investigator: Gary King

Co-Principal Investigator: Mercè Crosas

Development Team: Gustavo Durand (Technical Lead), Leonid Andreev, Stephen Kraffmiller, Phil Durbin, Raman Prasad, Danny Brooke (Project Manager)

UI/UX Team: Tania Schlatter, Michael Heppler, Derek Murphy

QA and Technical Support: Kevin Condon

Curation and Archival Team: Sonia Barbosa, Dwayne Liburd, Julian Gautier

Dataverse also receives contributions from a growing open-source community of individuals and institutions around the world.

The Name

Special thanks to Ella Michelle King, who won the contest to name our project, and to Pitney Bowe and The Forbin Group, Inc. for trademark assistance.

The Funding

Dataverse is funded by Harvard, with additional support from the Alfred P. Sloan Foundation, National Science Foundation, National Institutes of Health, Helmsley Charitable Trust, IQSS's Henry A. Murray Research Archive, and many others.