Strategic Goals, Roadmap, and Releases

Strategic Goals

The Strategic Goals of the Dataverse Project are our highest-level guide. These goals are to:

  1. increase adoption (users, dataverses, datasets, installations, journals)
  2. finish Dataverse 4 migration features
  3. develop capability to handle sensitive, large scale, and streaming data
  4. expand data and metadata features for existing and new disciplines
  5. expand archival and preservation features
  6. increase interoperability through implementation of standards
  7. increase contributions from the open-source development community
  8. improve UX and UI
  9. continue to increase the quality of the software

Throughout the year, we'll identify big steps that we can take to focus on one or more of these goals. These big steps are represented on our Roadmap. The Roadmap items that we're about to work on will be well defined, but Roadmap items that are further out may just be big problems we know we need to solve in some way. Although we are committed to the Roadmap items below, the timeframes of items further out may shift slightly as critical issues, other priorities, or dependencies arise.

Once we know what features and enhancements we'll add to honor the steps on the Roadmap, we'll plan a Release. If a release's text is hyperlinked, clicking it will take you to our task board, where you can see the status of that release's tasks.

Q3 2017

  • Administrative Dashboard
  • AWS S3 Support
  • Support for Large Data

4.7.1 Dashboard (released July 14th)

Administrators will be able to manage installation users and superusers through a user interface. This also provides the foundation for further administrative functionality in the UI.

4.8 AWS S3 Support and Large Data Upload Integration (released September 26th)

Administrators will be able to use AWS S3 for Dataverse file storage, which is more cost effective than other AWS storage options. This also provides a new cloud-based way to run Dataverse for current and new installations.

Dataverse now integrates with the Data Capture Module, an optional component for depositing large datasets (both a large number of files and large files). Specific support for large datasets includes client-side checksums, non-HTTP uploads (currently rsync over SSH), and preservation of the in-place directory hierarchy; a sketch of the checksum step appears below. This will expand Dataverse to other disciplines and allow the project to handle large-scale data.
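
To illustrate the client-side checksum step, here is a minimal Python sketch of how a client might checksum files and build a manifest that preserves the directory hierarchy before an rsync upload. It is an illustration only, not the Data Capture Module's actual implementation; the MD5 default, the build_manifest helper, and the manifest format are all assumptions.

    import hashlib
    from pathlib import Path

    def file_checksum(path, algorithm="md5", chunk_size=1 << 20):
        """Checksum a file in fixed-size chunks so large files never sit wholly in memory."""
        digest = hashlib.new(algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def build_manifest(root):
        """Map each file's dataset-relative path to its checksum, keeping the in-place hierarchy."""
        root = Path(root)
        return {str(p.relative_to(root)): file_checksum(p)
                for p in sorted(root.rglob("*")) if p.is_file()}

Computing checksums on the client before transfer lets the server verify that every file arrived intact, which matters most for exactly the large, multi-file deposits this integration targets.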

Q4 2017

  • Performance Enhancements
  • ORCID API Upgrade
  • Schema.org
  • Persistent IDs for Files

4.8.1 Performance Enhancements (released October 10th)

Datasets with large numbers of files will load much more quickly.

4.8.2 Docker Images, Dataset Locking Updates (released November 7th)

Curators may again edit datasets while "In Review." Experimental Docker images are now available.

4.8.3 ORCID Login Updates (released November 17th)

This release updates Dataverse's ORCID integration from API v1.2 to v2.0. For more information about the enhancements included in v2.0, visit https://members.orcid.org/api/news/xsd-20-update.

4.8.4 schema.org Support (released December 5th)

Additional support for machine-readable metadata: dataset pages now expose their metadata in schema.org format, making datasets easier for search engines and other tools to discover and index.
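
As a rough illustration of consuming this markup, the Python sketch below fetches a dataset landing page and parses the embedded JSON-LD block. The example URL is a hypothetical placeholder, and the single-script-tag assumption is a simplification rather than a guarantee of the page structure.

    import json
    import re
    import urllib.request

    def schema_org_metadata(dataset_url):
        """Fetch a dataset landing page and extract its schema.org JSON-LD metadata."""
        with urllib.request.urlopen(dataset_url) as response:
            html = response.read().decode("utf-8")
        # Assumes the page embeds one <script type="application/ld+json"> block.
        match = re.search(r'<script type="application/ld\+json">(.*?)</script>',
                          html, re.DOTALL)
        return json.loads(match.group(1)) if match else None

    # Hypothetical usage; substitute a real dataset's persistent URL:
    # print(schema_org_metadata("https://demo.dataverse.org/dataset.xhtml?persistentId=doi:10.5072/FK2/EXAMPLE"))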

2018

  • Persistent Identifiers for Files
  • Support for Sensitive Data
  • TBD

4.9 Persistent Identifiers for Files

In addition to persistent identifiers at the dataset level, Dataverse will mint persistent identifiers at the file level.

5.0 Support for Sensitive Data and Data Provenance

By implementing DataTags file-level security and access requirements and integrating with the DataTags interview tool and the PSI differential privacy tool, Dataverse will be able to support sensitive data.

Integrating with a data provenance system will allow users to track where data files and datasets came from and how they were modified. This expansion of Dataverse's data and metadata features increases reproducibility.

The Dataverse Roadmap for 2018 is not yet fully defined. These are some things that we're thinking about:

  • Support for Streaming Data
  • File Hierarchy Within Datasets
  • Large Data Support and HTTP Upload Support
  • Embargoed Datasets