Dataverse Software 5.9 Release

This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.


Release Highlights

Dataverse Collection Page Optimizations

The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on reducing the number of queries for each page load. These optimizations will be more noticeable on Dataverse installations with higher traffic.

Support for HTTP "Range" Header for Partial File Downloads

Dataverse now supports the HTTP "Range" header, which allows users to download parts of a file. Here are some examples:

  • bytes=0-9 gets the first 10 bytes.
  • bytes=10-19 gets 10 bytes from the middle.
  • bytes=-10 gets the last 10 bytes.
  • bytes=10- gets all bytes except the first 10.

Only a single range is supported. For more information, see the Data Access API section of the API Guide.
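As an illustration of how a single-range header maps to byte offsets, here is a minimal Python sketch. The helper names resolve_range and build_range_request, the server URL, and the file ID are hypothetical. The endpoint path is assumed to follow the Data Access API form described in the API Guide, and the request is only constructed, never sent.

```python
from typing import Tuple
from urllib.request import Request


def resolve_range(header_value: str, size: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) byte offsets selected by a
    single-range header value for a file of `size` bytes."""
    unit, _, spec = header_value.partition("=")
    if unit != "bytes" or "," in spec or "-" not in spec:
        raise ValueError("only a single 'bytes' range is handled in this sketch")
    first, _, last = spec.partition("-")
    if first == "":
        # Suffix form (bytes=-N): the final N bytes of the file.
        return max(size - int(last), 0), size - 1
    start = int(first)
    # Open-ended form (bytes=N-): everything from offset N to the end.
    end = int(last) if last else size - 1
    return start, min(end, size - 1)


def build_range_request(base_url: str, file_id: int, header_value: str) -> Request:
    """Build (but do not send) a download request carrying a Range header.
    The path is an assumption; confirm it against the API Guide."""
    url = f"{base_url}/api/access/datafile/{file_id}"
    return Request(url, headers={"Range": header_value})


# Hypothetical installation URL and file ID, for illustration only.
request = build_range_request("https://dataverse.example.edu", 42, "bytes=0-9")
```

Calling resolve_range("bytes=-10", 100) selects offsets 90 through 99, which matches the "last 10 bytes" case in the list above.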

Support for Optional External Metadata Validation Scripts

The Dataverse software now allows an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. The Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use, specifically spam content. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs.

Please note that only the metadata are subject to these validation checks. This does not check the content of any uploaded files.

For more information, see the Database Settings section of the Installation Guide. The new settings are listed below, in the "New JVM Options and DB Settings" section of these release notes.
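To make the idea concrete, here is a minimal sketch of what such an external validation script could look like in Python. It rests on assumptions that should be checked against the Database Settings documentation: that the script receives the path to a JSON export of the metadata as its only argument, and that an exit status of 0 means "valid" while any other value means "rejected". The banned terms and the function names are hypothetical.

```python
import json
import sys

# Hypothetical spam markers; a real installation would maintain its own list.
BANNED_TERMS = ("buy cheap", "casino bonus")


def iter_strings(node):
    """Yield every string value found anywhere in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_violations(metadata, banned=BANNED_TERMS):
    """Return the sorted list of banned terms that occur in the metadata."""
    hits = set()
    for text in iter_strings(metadata):
        lowered = text.lower()
        hits.update(term for term in banned if term in lowered)
    return sorted(hits)


def main(argv):
    """Assumed calling convention: argv[1] is the metadata JSON file.
    Returns 0 when the metadata passes, non-zero otherwise."""
    if len(argv) != 2:
        print("usage: validate_metadata.py METADATA_JSON", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as handle:
        metadata = json.load(handle)
    violations = find_violations(metadata)
    if violations:
        print("rejected: " + ", ".join(violations), file=sys.stderr)
        return 1
    return 0
```

To run this as a standalone script, the file would end with sys.exit(main(sys.argv)). Because the check walks every string in the export, it covers metadata only, consistent with the note above that file contents are not inspected.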

Displaying Author's Identifier as Link

In the dataset page's metadata tab, the author's identifier is now displayed as a clickable link that points to the profile page in the external service (ORCID, VIAF, etc.) in cases where the identifier scheme provides a resolvable landing page. If the identifier does not match the format expected for its scheme, a link is not shown.
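The behavior described here can be sketched as a lookup from identifier scheme to a format check and a URL template. This is an illustration only: the patterns and the function name identifier_link are assumptions, not the rules the Dataverse software actually applies, and only two schemes are shown.

```python
import re

# Assumed per-scheme format check and landing-page template.
SCHEMES = {
    "ORCID": (re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]"), "https://orcid.org/{}"),
    "VIAF": (re.compile(r"\d+"), "https://viaf.org/viaf/{}"),
}


def identifier_link(scheme, identifier):
    """Return a landing-page URL, or None when no link should be shown."""
    entry = SCHEMES.get(scheme)
    if entry is None:
        return None  # scheme has no known resolvable landing page
    pattern, template = entry
    if not pattern.fullmatch(identifier):
        return None  # identifier does not match the expected format
    return template.format(identifier)
```

Returning None for both unknown schemes and malformed identifiers mirrors the rule that the identifier is shown as plain text unless a valid link can be built.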

Auxiliary File API Enhancements

This release includes updates to the Auxiliary File API. These updates include:

  • Auxiliary files can now also be associated with non-tabular files
  • Auxiliary files can now be deleted
  • Duplicate Auxiliary files can no longer be created
  • A new API has been added to list Auxiliary files by their origin
  • Some Auxiliary files were being saved with the wrong content type (MIME type); the user can now supply the content type on upload, overriding the type that would otherwise be assigned
  • Improved error reporting
  • A bugfix involving checksums for Auxiliary files

Please note that the Auxiliary files feature is experimental and is designed to support integration with tools from the OpenDP Project. If the API endpoints are not needed they can be blocked.
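The rules listed above can be pictured with a small in-memory model. This Python sketch does not call the Auxiliary File API. The class names, and the choice of (data file ID, format tag, format version) as the key that defines a duplicate, are assumptions made for illustration; the API Guide documents the real endpoints and parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AuxFile:
    format_tag: str
    format_version: str
    origin: str
    content_type: str


@dataclass
class AuxStore:
    """Toy stand-in for the server-side bookkeeping of Auxiliary files."""
    _files: Dict[Tuple[int, str, str], AuxFile] = field(default_factory=dict)

    def add(self, datafile_id: int, format_tag: str, format_version: str,
            origin: str, detected_type: str,
            supplied_type: Optional[str] = None) -> AuxFile:
        key = (datafile_id, format_tag, format_version)
        if key in self._files:
            # Duplicates are refused rather than silently overwritten.
            raise ValueError("auxiliary file already exists for this key")
        # A content type supplied on upload overrides the detected one.
        aux = AuxFile(format_tag, format_version, origin,
                      supplied_type or detected_type)
        self._files[key] = aux
        return aux

    def delete(self, datafile_id: int, format_tag: str, format_version: str) -> None:
        # Raises KeyError if no such auxiliary file exists.
        del self._files[(datafile_id, format_tag, format_version)]

    def list_by_origin(self, datafile_id: int, origin: str) -> List[AuxFile]:
        return [aux for (fid, _, _), aux in self._files.items()
                if fid == datafile_id and aux.origin == origin]
```

The model makes no distinction between tabular and non-tabular data files, which reflects the first item in the list: any data file can carry Auxiliary files.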

Major Use Cases and Infrastructure Enhancements

Newly-supported major use cases in this release include:

  • The Dataverse collection page has been optimized, resulting in quicker load times on one of the most common pages in the application (Issue #7804, PR #8143)
  • Users will now be able to specify a certain byte range in their downloads via API, allowing for downloads of file parts. (Issue #6397, PR #8087)
  • A Dataverse installation administrator can now set up metadata validation for datasets and Dataverse collections, allowing for publish-time and create-time checks for all content. (Issue #8155, PR #8245)
  • Users will be provided with clickable links to authors' ORCIDs and other IDs in the dataset metadata (Issue #7978, PR #7979)
  • Users will now be able to associate Auxiliary files with non-tabular files (Issue #8235, PR #8237)
  • Users will no longer be able to create duplicate Auxiliary files (Issue #8235, PR #8237)
  • Users will be able to delete Auxiliary files (Issue #8235, PR #8237)
  • Users can retrieve a list of Auxiliary files based on their origin (Issue #8235, PR #8237)
  • Users will be able to supply the content type of Auxiliary files on upload (Issue #8241, PR #8282)
  • The indexing process has been updated so that datasets with fewer files are indexed first, resulting in fewer failures and making it easier to identify problematically large datasets. (Issue #8097, PR #8152)
  • Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242)
  • The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" (Issue #8261, PR #8262)
  • A Dataverse installation administrator can now retrieve more information about role deletion from the ActionLogRecord (Issue #2912, PR #8211)
  • A new role is available that allows a user to respond to file download requests without also being given the power to manage the dataset (Issue #8109, PR #8174)
  • Users will no longer be forced to update their passwords when moving from Dataverse 3.x to Dataverse 4.x (PR #7916)
  • Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257)
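
The indexing change listed above (Issue #8097) amounts to ordering datasets by file count before they are processed. The following Python sketch shows that ordering under stated assumptions: the function name and the identifiers are hypothetical, and the real implementation lives in the Dataverse indexing code, not in Python.

```python
from typing import Dict, List


def indexing_order(file_counts: Dict[str, int]) -> List[str]:
    """Return dataset identifiers sorted so that datasets with fewer files
    come first; ties are broken by identifier for a stable order."""
    return sorted(file_counts, key=lambda pid: (file_counts[pid], pid))


# Hypothetical identifiers and file counts, for illustration only.
order = indexing_order({"dataset-A": 500, "dataset-B": 3, "dataset-C": 40})
```

With this ordering, a very large dataset that fails to index is reached last, so it cannot hold up the smaller datasets ahead of it.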