Dataverse Software 5.7 Release

This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project. You can learn more and download the release on GitHub:

https://github.com/IQSS/dataverse/releases/tag/v5.7

Release Highlights

Experimental Support for External Vocabulary Services

Dataverse installations can now be configured to associate specific metadata fields with third-party vocabulary services, giving users an easy way to select values from those vocabularies. The mapping relies on external JavaScript files. Two such scripts have been developed so far: one for vocabularies served via the SKOSMOS protocol and one that allows people to be identified via their ORCID. The guides describe the new :CVocConf setting used for configuration and provide additional information about this functionality. Scripts, examples, and additional documentation are available at the GDCC GitHub Repository.

Please watch the online presentation, read the document with requirements, and join the Dataverse Working Group on Ontologies and Controlled Vocabularies if you have questions or want to contribute.

This functionality was initially developed by Data Archiving and Networked Services (DANS-KNAW), the Netherlands, and funded by SSHOC, "Social Sciences and Humanities Open Cloud". SSHOC has received funding from the European Union’s Horizon 2020 project call H2020-INFRAEOSC-04-2018, grant agreement #823782. It was further improved by the Global Dataverse Community Consortium (GDCC) and extended with support for semantic search.

Curation Status Labels

A new :AllowedCurationLabels setting allows sysadmins to define one or more sets of labels that can be applied to a draft Dataset version via the user interface or API to indicate the status of the dataset with respect to a defined curation process.

Labels are completely customizable (alphanumeric characters or spaces, up to 32 characters, e.g. "Author contacted", "Privacy Review", "Awaiting paper publication"). Superusers can select a specific set of labels, or disable this functionality, per collection. Anyone who can publish a draft dataset (e.g. curators) can set, change, or remove labels (from the set specified for the collection containing the dataset) via the user interface or via an API. The API also allows external tools to search for, read, and set labels on Datasets, providing an integration mechanism. Labels are visible on the Dataset page and in Dataverse collection listings/search results. Internally, the labels have no effect, and at publication, any existing label will be removed. A reporting API call allows admins to get a list of datasets and their curation statuses.
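
As a rough illustration of the label constraints described above, the following Python sketch checks candidate labels before they are grouped into named sets. The function names, the example set name "Standard Process", and the JSON shape (an object mapping set names to lists of labels) are assumptions made for this example; consult the guides for the exact format expected by :AllowedCurationLabels. The sketch also assumes "alphanumeric" means ASCII letters and digits and that empty labels are not allowed.

```python
import json
import re

# Constraint from the release notes: alphanumeric characters or spaces,
# up to 32 characters. Restricting to ASCII and rejecting empty labels
# are assumptions of this sketch.
LABEL_PATTERN = re.compile(r"[A-Za-z0-9 ]{1,32}")


def is_valid_label(label):
    """Return True if the label satisfies the assumed constraints."""
    return LABEL_PATTERN.fullmatch(label) is not None


def build_label_sets(label_sets):
    """Validate every label and serialize the sets as a JSON string.

    label_sets maps a (hypothetical) set name to a list of labels.
    Raises ValueError if any label violates the constraints.
    """
    for set_name, labels in label_sets.items():
        invalid = [label for label in labels if not is_valid_label(label)]
        if invalid:
            raise ValueError(f"Invalid labels in set {set_name!r}: {invalid}")
    return json.dumps(label_sets)


if __name__ == "__main__":
    # Labels taken from the examples in the release notes.
    example_sets = {
        "Standard Process": [
            "Author contacted",
            "Privacy Review",
            "Awaiting paper publication",
        ]
    }
    print(build_label_sets(example_sets))
```

Validating labels up front like this is one way to catch an overly long label or stray punctuation before the value is stored in the setting.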

For this feature to work, the Solr schema must be updated as part of installing this release.

Major Use Cases

Newly-supported major use cases in this release include:

  • Administrators will be able to set up integrations with external vocabulary services, allowing for autocomplete-assisted metadata entry, metadata standardization, and better integration with other systems (Issue #7711, PR #7946)
  • Users viewing datasets in the root Dataverse collection will now see breadcrumbs that have a link back to the root Dataverse collection (Issue #7527, PR #8078)
  • Users will be able to more easily differentiate between datasets and files through new iconography (Issue #7991, PR #8021)
  • Users retrieving large guestbooks over the API will experience fewer failures (Issue #8073, PR #8084)
  • Dataverse collection administrators can specify which language will be used when entering metadata for new Datasets in a collection, based on a list of languages specified by the Dataverse installation administrator (Issue #7388, PR #7958)
    • Users will see the language used for metadata entry indicated at the document or element level in metadata exports (Issue #7388, PR #7958)
    • Administrators will now be able to specify the language(s) of controlled vocabulary entries, in addition to the installation's default language (Issue #6751, PR #7959)
  • Administrators and curators can now receive notifications when a dataset is created (Issue #8069, PR #8070)
  • Administrators with large files in their installation can disable the automatic checksum verification process at publish time (Issue #8043, PR #8074)