R04. Confidentiality/Ethics

From the CTS application:
The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with disciplinary and ethical norms.

Depositors can restrict files and grant access to specific people or groups of people. Access can be granted to each file, to all files in a dataset, and to all files in all datasets within a Dataverse collection.
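
The three grant levels described above can be pictured as a lookup that checks the file, then its dataset, then the enclosing collection. The following is a minimal sketch of that idea only, not the Dataverse implementation or its API; the `AccessGrants` class, its fields, and the identifiers such as `file-1` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessGrants:
    """Hypothetical model of grants at file, dataset, and collection level."""
    file_grants: dict = field(default_factory=dict)        # file id -> set of users
    dataset_grants: dict = field(default_factory=dict)     # dataset id -> set of users
    collection_grants: dict = field(default_factory=dict)  # collection id -> set of users

    def can_download(self, user, file_id, dataset_id, collection_id, restricted=True):
        # Unrestricted files are open to everyone.
        if not restricted:
            return True
        # For a restricted file, a grant at any of the three levels is enough.
        return (
            user in self.file_grants.get(file_id, set())
            or user in self.dataset_grants.get(dataset_id, set())
            or user in self.collection_grants.get(collection_id, set())
        )


# Illustrative data: one grant at each level.
grants = AccessGrants(
    file_grants={"file-1": {"alice"}},
    dataset_grants={"dataset-A": {"bob"}},
    collection_grants={"collection-X": {"carol"}},
)
```

In this sketch, `alice` can reach only `file-1`, `bob` can reach every file in `dataset-A`, and `carol` can reach every file in every dataset under `collection-X`.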

The Dataverse community encourages complete and open sharing of data, by default applying CC0 waivers to deposited datasets, but depositors can edit their datasets’ license metadata to inform others about any conditions for accessing and re-using the data.

Procedures for managing data with disclosure risks can include:

  • Deaccessioning specific versions of a dataset or all versions
  • Removing (and restoring) depositors’ edit access to datasets
  • Removing (and restoring) download access to specific files or all files in a dataset

The Dataverse software requires contact information from dataset depositors, which collection support staff and others can use to contact depositors whose data may have disclosure risks.

Answers from successful applicants

Tilburg University Dataverse collection:

As stated in its Research Data Management Regulations, Tilburg University and the university researchers comply with the relevant codes of conduct and the regulations that contain standards and best practices regarding, among other things, research data, in particular

A user who wants to access and use any of Tilburg University’s stored research data must agree to the conditions, specified per study, for the use of the data and other research material. Research data may only be made available to third parties to the extent compatible with the ownership of the data, applicable legal provisions, or codes of conduct (e.g., the Personal Data Protection Act (Wet Bescherming Persoonsgegevens http://wetten.overheid.nl/BWBR0011468) and the Code of Conduct for the Use of Personal Data in Scientific Research (Gedragscode voor het gebruik van persoonsgegevens in wetenschappelijk onderzoek http://www.vsnu.nl/files/documenten/Domeinen/Accountability/Codes/Gedragscode%20persoonsgegevens.pdf)), or any other obligation, e.g., of secrecy, with respect to the research data.

The new European regulations about data protection (General Data Protection Regulation, GDPR) are taken into account.

The General Terms of Use of Dataverse do not allow submission of any confidential or secret information. During the quality check of the data package, the Data Curator checks if any such information is in the data package.

See also:
Ethical research and data practices are of concern to all researchers. They can be of particular concern to qualitative researchers who have long-established relationships of trust with research participants. As part of QDR’s curation protocol, for all data projects that include data gathered from human participants (such as interview transcripts), QDR requests and reviews IRB/ethics board approval and the informed consent language used during the research to help the depositor evaluate if the sharing of data is precluded, and/or to aid the depositor in respecting any limits on the sharing of data that result from guarantees made to project participants.

Where deposited data are de-identified, curators review all documents to help depositors decide whether de-identification has been carried out properly. Where data and related documentation are in a language in which no member of the QDR staff is proficient, curation staff uses automated translation to spot check de-identification and conveys best de-identification practices to depositors. In all cases, QDR’s role is advisory. The final responsibility for decisions concerning de-identification remains with depositors.

During initial consultation with depositors, QDR staff also help the depositor assess the sensitivity of the data and potential disclosure risks, and aid the depositor in identifying appropriate levels of access control, ranging from access for all registered users to on-site-only access. Details of available access conditions are described in the documentation of access conditions.

For sensitive data, QDR follows strict protocols for transmitting, handling, and storing data. Depositors are instructed to encrypt data using AES-256 encryption prior to transfer, and to transfer the data using SFTP or a Dropbox business folder with multi-factor authentication enforced for all users able to access content. Sensitive data are stored using AES-256 encryption.

Where the depositor requests additional safeguards for sensitive data, we help them to decide which access conditions should be imposed so that the data can be downloaded. The data are then distributed under a Special Download Agreement reflecting those access conditions. The conditions specified in the agreement reflect the nature of the disclosure risk in the data and can contain, for example, requirements for IRB/ethics board approval and/or a data security plan.

Sensitive data require responsible use. QDR ensures that data are released only to personally identified individuals: access to data is granted following authentication via institutional e-mail and videoconferencing.

Additional requirements for data use are specified in the special download agreement and follow both depositor requests and QDR's assessment of the identifiability and risk for human participants of the data in question. The general requirements, by level of sensitivity, are outlined in QDR's "Handling Sensitive Data" policy under "Access to Restricted Data". For low sensitivity restricted data, authentication, a research plan, and assurances to not distribute the data further are typically sufficient for access. For medium sensitivity data, QDR requires a detailed data security plan as well as IRB approval for the proposed research and, in addition to the depositor's signature, the signature of an authorized institutional representative on the special download agreement. For highly sensitive data, QDR only allows access in person in a monitored room and screens users' notes. (While QDR does have the capacity to provide such access, it does not currently hold any data it classifies as highly sensitive.) QDR is continuously exploring additional means of certifying researchers for access to sensitive data and thus facilitating access. We expect to be participating as a pilot institution in ICPSR's "Research Passport" initiative (see working paper linked below) that will leverage cross-repository collaboration to certify researchers in handling sensitive data.
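
The tiered requirements in the preceding paragraph can be read as a lookup from sensitivity level to a checklist. The sketch below is illustrative only and is not QDR's policy or software: the requirement labels paraphrase the paragraph, the identifiers (`REQUIREMENTS`, `missing_requirements`) are hypothetical, and the sketch lists only what the paragraph names for each level, without deciding whether lower-level requirements carry over to higher levels.

```python
# Requirement labels paraphrase the paragraph above; all identifiers are hypothetical.
REQUIREMENTS = {
    "low": {"authentication", "research_plan", "no_redistribution_assurance"},
    "medium": {"data_security_plan", "irb_approval", "institutional_signature"},
    "high": {"on_site_monitored_access", "notes_screening"},
}


def missing_requirements(sensitivity, provided):
    """Return the listed requirements for a level that a request does not yet meet."""
    if sensitivity not in REQUIREMENTS:
        raise ValueError(f"unknown sensitivity level: {sensitivity!r}")
    # Set difference: everything required at this level minus what was supplied.
    return REQUIREMENTS[sensitivity] - set(provided)
```

For example, a request for low sensitivity data that supplies only authentication and a research plan would still be missing the assurance not to redistribute the data.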

For data not deemed sensitive, QDR also requires, through its Terms and Conditions as well as its Standard Download Agreement, that researchers agree to use the data ethically. As outlined, this includes giving attribution when using the data, not re-publishing it without explicit consent, and not using it for commercial purposes.

Given these precautions, we expect any misuse of QDR's data to be rare. Should it occur, QDR’s Download Agreements stipulate a range of sanctions for violation of the agreements, including deletion of user accounts, contacting the QDR institutional representative at the user’s home institution (if that institution is a member of QDR) and the IRB at the user’s home institution, and, in cases endangering human participants, reporting to the federal Office for Human Research Protections.

QDR limits access of QDR staff to sensitive data. All access is overseen by senior staff, who have trained (and published) on the handling of sensitive qualitative data and regularly attend international workshops and conferences in data science and management to remain informed of state-of-the-art practices and technology.

DataverseNO is a repository for open research data – meaning that datasets must only contain unrestricted content with no private, confidential, or other legally protected information. DataverseNO may only make available content that is publicly distributable. This is part of the DataverseNO Deposit Agreement [1] that the depositor has agreed to before deposit, and the DataverseNO Accession Policy [2].

The depositor is solely responsible for the content deposited in DataverseNO, and shall not provide DataverseNO with any confidential or proprietary information that is required to be kept secret. By submitting content for deposit in DataverseNO, the depositor represents and warrants this to be in agreement with the General guidelines for research ethics, as well as subject-specific guidelines, from the Norwegian National Committees for Research Ethics [3] [4] [5]. DataverseNO may remove any content at any time if it does not comply with the DataverseNO Deposit Agreement.

Although the depositor is solely responsible for the content, Research Data Service staff will check and review deposited datasets before publishing (see requirements R0 Level of Curation Performed) [6]. This includes checking for compliance with legal and ethical requirements, as well as with more general requirements in the DataverseNO Deposit Agreement. Any doubt or question concerning compliance with the requirements mentioned above will be discussed with the depositor to ensure compliance with the DataverseNO policies and guidelines before a dataset can be published. The Research Data Service staff responsible for data curation are trained for the task by highly competent staff from the library at UiT The Arctic University of Norway, who also provide training and give courses [7] in various aspects of research data management, including management of research data with personal/sensitive information.

All employees of UiT The Arctic University of Norway and the DataverseNO partner institutions are covered by the Norwegian Public Administration Act, section 13, and have signed a confidentiality agreement [8], ensuring that no confidential or personal information from their work (including DataverseNO) is disclosed.

DataverseNO requires that depositors define a license (see R2) for their dataset at the time of deposit, and licensing information is displayed in the metadata for each dataset. When trying to download a dataset with any license other than the default CC0, the user will be presented with the actual license and terms (preferably CC BY), and must accept the conditions before downloading. In the case of non-compliance with any access and use license other than CC0 (or equivalent), the use of the dataset must be terminated immediately at the initial demand by DataverseNO. If the use is not terminated, DataverseNO may bring action against the user (see R2).
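
The download rule described above amounts to a simple gate: CC0 datasets download directly, while any other license must be accepted first. The following is a minimal sketch under that reading, not DataverseNO's actual code; `DEFAULT_LICENSE` and `may_download` are hypothetical names.

```python
DEFAULT_LICENSE = "CC0"  # the default license named in the text


def may_download(license_name, accepted_terms):
    """Allow CC0 downloads directly; require acceptance for any other license."""
    if license_name == DEFAULT_LICENSE:
        return True
    # Any other license (for example CC BY) must be accepted by the user first.
    return bool(accepted_terms)
```

In this sketch, a CC BY dataset stays blocked until the user has accepted the displayed license and terms.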

