Harvard’s Subscription Data Dataverse


This post is from Sonia Barbosa, Manager of Data Curation, Harvard Dataverse. Get in touch at support@dataverse.org.

You can join Harvard’s Subscription Data Dataverse and share licensed data easily!

This year, Harvard’s Dataverse Repository decided it needed a long-term home for data licensed to Harvard affiliates. “Why now,” you may ask, “when Dataverse has been in place and available for use since 2006?” Let’s look at how this new endeavor came about.

As the Manager of Data Curation for Harvard’s Dataverse repository, I have spent many, many years helping Harvard affiliates share their data on Harvard’s Dataverse. A little over a year ago, a former Harvard affiliate dropped off some data in my office with the request that the data be managed by our office. Those materials turned out to be licensed data from the Linguistic Data Consortium (https://www.ldc.upenn.edu/) that had been floating around Harvard University without a persistent collections manager. 

So here we were, with a few DVDs but not the entire collection of LDC data that Harvard has paid for over the years, trying to decide how best to support this collection, and I thought, “Why not Dataverse?”

Why not start by creating a space where Harvard affiliates can access data using our new authentication feature, which lets users log into Harvard’s Dataverse with their HarvardKey?

Why not use Dataverse to share the data and also manage the subscription we have with LDC? I can track the number of users accessing the data files. I don’t need to validate that a user is Harvard-affiliated because they can only access the files using their HarvardKey. I don’t ever need to chase a DVD again.

In the middle of this process, Widener Library contacted me because they needed space for their licensed data. At the time, Numeric Data Services was also sending their data inquiries to our support queue at IQSS while they waited for Shibboleth to be added to Harvard’s Dataverse. With consent from their team, we moved the Numeric Data Services Dataverse to the new licensed page and updated their permissions so that data access was now handled through Shibboleth and we no longer had to validate a user’s affiliation. Not much later, we extended this invitation to the Harvard Business School. Widener, LDC, Numeric Data Services and Harvard Business School have found a pretty awesome and reliable tool to share and manage the data licensed to Harvard. Given that Dataverse is a collaboration with Harvard Library, it makes sense that we should assist in making their licensed data accessible to the community.

We have been marketing the Licensed Dataverse space to as many groups at Harvard as possible, hoping they will take advantage of it to share their data. We point to the convenience of Shibboleth authentication and, of course, to the long-term storage and ease of use of Dataverse. This Dataverse also displays all of the licensed data in one space for easy accessibility and discovery. If you log into Harvard Dataverse with your HarvardKey and navigate to the “my data” option found next to your username, you will have access to all of the licensed data that have been added to Harvard’s Subscription Dataverse, some of which you may not have known existed! This is what data discoverability is all about.

If you have yet to join the Licensed Data Dataverse team, don’t hesitate to contact me and get started!