Dataverse Software 5.13 Release

This release brings new features, enhancements, and bug fixes to the Dataverse software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.

https://github.com/IQSS/dataverse/releases/tag/v5.13

Release Highlights

Schema.org Improvements (Some Backward Incompatibility)

The Schema.org metadata used as an export format and also embedded in dataset pages has been updated to improve compliance with Schema.org's schema and Google's recommendations for Google Dataset Search.

Please be advised that these improvements may break integrations that rely on the old, less compliant structure. For details, see the "backward incompatibility" section below. (Issue #7349)

Folder Uploads via Web UI (dvwebloader, S3 only)

For installations that use S3 for storage and have direct upload enabled, a new tool called DVWebloader can be enabled. It allows web users to upload a folder with a hierarchy of files and subfolders while retaining the relative paths of files (similar to what the DVUploader tool does on the command line, but with the convenience of the browser UI). See Folder Upload in the User Guide for details. (PR #9096)

Long Descriptions of Collections (Dataverses) are Now Truncated

As with datasets, long descriptions of collections (dataverses) are now truncated by default but can be expanded with a "read full description" button. (PR #9222)

License Sorting

Licenses shown in the dropdown in the UI can now be sorted by superusers. See the Sorting Licenses section of the Installation Guide for details. (PR #8697)

Metadata Field Production Location Now Repeatable, Facetable, and Enabled for Advanced Search

Depositors can now click the plus sign to enter multiple instances of the metadata field "Production Location" in the citation metadata block. Additionally, this field now appears on the Advanced Search page and can be added to the list of search facets. (PR #9254)

Support for NetCDF and HDF5 Files

NetCDF and HDF5 files are now detected based on their content rather than just their file extension. Both "classic" NetCDF 3 files and more modern NetCDF 4 files are detected based on content. Detection for older HDF4 files is only done through the file extension ".hdf", as before.
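To illustrate what content-based detection means, here is a minimal sketch that classifies a file from its leading bytes. The signatures come from the public NetCDF and HDF5 format specifications. The helper name `sniff_format` and its return labels are hypothetical, and this is not necessarily how Dataverse performs the check internally. The sketch only looks at offset 0, although HDF5 also permits the signature at later offsets when a user block is present.

```python
# Magic-byte signatures from the public NetCDF and HDF5 format specifications.
NETCDF_CLASSIC_PREFIX = b"CDF"
NETCDF_CLASSIC_VERSIONS = {1, 2, 5}  # classic, 64-bit offset, and CDF-5 variants
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_format(header: bytes) -> str:
    """Classify a file from its first bytes (hypothetical helper, offset 0 only)."""
    if header.startswith(HDF5_SIGNATURE):
        # NetCDF 4 files are stored in HDF5 containers, so they share this signature.
        return "hdf5-or-netcdf4"
    if (
        len(header) >= 4
        and header[:3] == NETCDF_CLASSIC_PREFIX
        and header[3] in NETCDF_CLASSIC_VERSIONS
    ):
        return "netcdf-classic"
    return "unknown"
```

In practice the first eight bytes of a file are enough input for this check, for example `sniff_format(open(path, "rb").read(8))`.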

For NetCDF and HDF5 files, an attempt will be made to extract metadata in NcML (XML) format and save it as an auxiliary file. There is a new NcML previewer available in the dataverse-previewers repo.

An extractNcml API endpoint has been added, especially for installations with existing NetCDF and HDF5 files. After upgrading, they can iterate through these files and try to extract an NcML file.
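A post-upgrade pass over existing files might start by building one request URL per file, as in the sketch below. The path `/api/files/{id}/extractNcml`, the example server name, and the file IDs are assumptions made for illustration. Confirm the exact path, HTTP method, and required permissions in the API Guide before using it.

```python
def ncml_extraction_urls(server_url, file_ids):
    """Build one extractNcml URL per file ID.

    ASSUMPTION: the endpoint lives at /api/files/{id}/extractNcml.
    Check the API Guide for the authoritative path and HTTP method.
    """
    base = server_url.rstrip("/")
    return [f"{base}/api/files/{file_id}/extractNcml" for file_id in file_ids]


# Hypothetical server and database IDs of existing NetCDF/HDF5 files.
urls = ncml_extraction_urls("https://dataverse.example.edu/", [24, 25])
```

Each URL would then be requested in turn with whatever HTTP client and credentials the installation normally uses for administrative API calls.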

See the NetCDF and HDF5 section of the User Guide for details. (PR #9239)

Support for .eln Files (Electronic Laboratory Notebooks)

The .eln file format is used by Electronic Laboratory Notebooks as an exchange format for experimental protocols, results, sample descriptions, etc.

Improved Security for External Tools

External tools can now be configured to use signed URLs to access the Dataverse API as an alternative to API tokens. This eliminates the need for tools to have access to the user's API token in order to access draft or restricted datasets and datafiles. Signed URLs can be transferred via POST or via a callback when triggering a tool via GET. See Authorization Options in the External Tools documentation for details. (PR #9001)

Geospatial Search (API Only)

Geospatial search is supported via the Search API using two new parameters: geo_point and geo_radius.

The fields that are geospatially indexed are "West Longitude", "East Longitude", "North Latitude", and "South Latitude" from the "Geographic Bounding Box" field in the geospatial metadata block. (PR #8239)
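As a rough sketch of how a client might assemble such a query, the helper below builds a Search API URL with both parameters. It assumes that `geo_point` takes the form `latitude,longitude`, that `geo_radius` is a distance in kilometers, and that the two are supplied together. The function name and the example server are hypothetical. Check the Search API documentation for the authoritative parameter formats.

```python
from urllib.parse import urlencode


def build_geo_search_url(server_url, query, latitude, longitude, radius_km):
    """Build a Search API URL restricted to a circle around a point.

    ASSUMPTIONS: geo_point is "latitude,longitude" and geo_radius is in
    kilometers; both are sent together.
    """
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    params = {
        "q": query,
        "geo_point": f"{latitude},{longitude}",
        "geo_radius": radius_km,
    }
    # urlencode percent-encodes the comma and the wildcard for us.
    return f"{server_url.rstrip('/')}/api/search?{urlencode(params)}"


# Hypothetical server; searches everything within 1.5 km of the point.
url = build_geo_search_url("https://dataverse.example.edu", "*", 42.3, -71.1, 1.5)
```

Validating the coordinates on the client side keeps obviously malformed requests from reaching the server.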

Reproducibility and Code Execution with Binder

Binder has been added to the list of external tools that can be added to a Dataverse installation. From the dataset page, you can launch Binder, which spins up a computational environment in which you can explore the code and data in the dataset, or write new code, such as a Jupyter notebook. (PR #9341)

CodeMeta (Software) Metadata Support (Experimental)

Experimental support for research software metadata deposits has been added.

By adding a metadata block for CodeMeta, we take another step toward first-class support for diverse FAIR objects, such as research software and computational workflows.

There is more work underway to make Dataverse installations around the world "research software ready."

Note: Like the metadata block for computational workflows before it, CodeMeta is listed under Experimental Metadata in the guides. Experimental means it's brand new, opt-in, and might need future tweaking based on experience in the field. We hope installations will send feedback on the new metadata block so that it can be refined and lifted from the experimental stage. (PR #7877)

Mechanism Added for Stopping a Harvest in Progress

It is now possible for a sysadmin to stop a long-running harvesting job. See Harvesting Clients in the Admin Guide for more information. (PR #9187)

API Endpoint Listing Metadata Block Details has been Extended

The API endpoint /api/metadatablocks/{block_id} has been extended to include the following fields:

  • controlledVocabularyValues: All possible values for fields with a controlled vocabulary. For example, the values "Agricultural Sciences", "Arts and Humanities", etc. for the "Subject" field.
  • isControlledVocabulary: Whether or not this field has a controlled vocabulary.
  • multiple: Whether or not the field supports multiple values.

See Metadata Blocks in the API Guide for details. (PR #9213)
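The new fields make it possible to collect controlled vocabularies directly from the endpoint's response. The sketch below works on a trimmed, hand-written sample. The overall response shape (a `data` object whose `fields` entry maps field names to field descriptions) is an assumption, and the helper name is hypothetical. Only the three field names listed above are taken from this release.

```python
# ASSUMPTION: trimmed, hand-written sample; the real response contains more
# keys, and its exact nesting should be checked against the API Guide.
sample_response = {
    "status": "OK",
    "data": {
        "name": "citation",
        "fields": {
            "title": {
                "isControlledVocabulary": False,
                "multiple": False,
            },
            "subject": {
                "isControlledVocabulary": True,
                "multiple": True,
                "controlledVocabularyValues": [
                    "Agricultural Sciences",
                    "Arts and Humanities",
                ],
            },
        },
    },
}


def controlled_vocabularies(block_response):
    """Map each controlled-vocabulary field name to its allowed values."""
    fields = block_response["data"]["fields"]
    return {
        name: list(field.get("controlledVocabularyValues", []))
        for name, field in fields.items()
        if field.get("isControlledVocabulary")
    }


vocabularies = controlled_vocabularies(sample_response)
```

A client could use the resulting mapping to validate input or populate dropdowns before submitting metadata.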

Advanced Database Settings

You can now enable advanced database connection pool configurations useful for debugging and monitoring as well as other settings. Of particular interest may be sslmode=require. See the new Database Persistence section of the Installation Guide for details. (PR #8915)

Support for Cleaning up Leftover Files in Dataset Storage

Experimental feature: the new Cleanup Storage of a Dataset API endpoint can remove leftover files from a dataset's storage location. A file counts as leftover if it is not in the dataset's file list but is named according to the Dataverse technical convention for dataset files.
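The selection rule can be pictured with the sketch below, which is only an illustration and not the endpoint's implementation. The name pattern (11 hexadecimal characters, a hyphen, 12 hexadecimal characters, and an optional suffix) is an assumption standing in for the Dataverse naming convention. The function name and sample values are hypothetical.

```python
import re

# ASSUMPTION: illustrative stand-in for the Dataverse naming convention.
GENERATED_NAME = re.compile(r"^[0-9a-f]{11}-[0-9a-f]{12}(\..*)?$")


def find_leftovers(stored_names, dataset_storage_ids):
    """Return stored names that follow the convention but belong to no dataset file."""
    known = set(dataset_storage_ids)
    leftovers = []
    for name in stored_names:
        if not GENERATED_NAME.match(name):
            continue  # not named like a dataset file, so never a candidate
        base = name.split(".", 1)[0]  # derived files share the base identifier
        if base not in known:
            leftovers.append(name)
    return sorted(leftovers)


stored = [
    "18b39722140-50eb7d3c5ece",       # listed in the dataset
    "18b39722140-50eb7d3c5ece.orig",  # derived from a listed file
    "18b3a000000-aaaaaaaaaaaa",       # follows the convention, not listed
    "README.txt",                     # does not follow the convention
]
leftovers = find_leftovers(stored, ["18b39722140-50eb7d3c5ece"])
```

Files that do not match the naming pattern are never candidates, which keeps unrelated content in the storage location out of scope.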

OAI Server Bug Fixed

A bug introduced in 5.12 prevented the Dataverse OAI server from serving incremental harvesting requests from clients. It is fixed in this release. (PR #9316)

Major Use Cases and Infrastructure Enhancements

Changes and fixes in this release not already mentioned above include:

  • Administrators can configure an alternative storage location where files uploaded via the UI are temporarily stored during the transfer from client to server. (PR #8983; see also the Configuration Guide)
  • To improve performance, Dataverse estimates download counts. This release includes an update that makes the estimate more accurate. (PR #8972)
  • Direct upload and out-of-band uploads can now be used to replace multiple files with one API call (complementing the prior ability to add multiple new files). (PR #9018)
  • A persistent identifier type, CSTR, has been added to the Related Publication field's ID Type child field. For datasets published with CSTR IDs, Dataverse will also include them in the datasets' Schema.org metadata exports. (Issue #8838)
  • Datasets that are part of linked dataverse collections will now be displayed in their linking dataverse collections.