# Trusted Data: Tech Notes

 




 
  

 

## November 2025

We are working toward automated trust by building three new mechanisms on the Dataverse platform.
  
Three Mechanisms for Trust:

1. Data Collection and Community-Derived Measures (Trusted Reviews)
2. Engagement: Voting
3. Automated Trust Indicators

  
The technical status of each of the three mechanisms, as of November 2025, is reported below:

### 1. Data Collection and Community-Derived Measures

Trusted Review relies on a flexible data‑collection mechanism that allows communities to plug in their own assessment workflows. Our update details three scenarios that we and our partners might pursue for capturing trust assessments:

- **External assessments (scenario 1):** A community panel of experts reviews resources using a separate process or third‑party tool (e.g., a Qualtrics survey). The results are tabulated and recorded in the metadata of individual Trusted Reviews. Scores can be entered manually or uploaded via API.
- **External tool integration (scenario 2):** Communities may use an external application, such as a Shiny app, to collect expert assessments, tabulate them, and write the final scores directly into Trusted Review metadata in a Dataverse collection.
- **On‑platform assessments (scenario 3):** Experts perform the assessment and scoring within the Dataverse platform. Reviewers record their assessments on Trusted Review drafts, and a collection manager records the final scores in the metadata. This scenario would require significant modifications to the Dataverse codebase and is still under investigation.

Scenario 1 is already supported; scenarios 2 and 3 will depend on new development work. Together, these options provide a flexible aggregation mechanism that communities can tailor to their needs while ensuring that final trust scores are stored consistently.
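As an illustration of scenario 1, the tabulation step could look roughly like the sketch below. It averages each criterion across expert responses exported from an external survey tool and builds a JSON record ready for manual entry or API upload. The criterion names, the record field names (`reviewedObject`, `scores`), and the DOI are hypothetical placeholders; the actual Trusted Review metadata fields are not specified in this note.

```python
import json
from statistics import mean


def tabulate_scores(responses):
    """Average each criterion across all expert responses.

    `responses` is a list of dicts mapping criterion name -> numeric score,
    for example one dict per exported survey row. Experts may skip criteria.
    """
    criteria = sorted({name for r in responses for name in r})
    return {
        name: round(mean(r[name] for r in responses if name in r), 2)
        for name in criteria
    }


def build_review_record(dataset_pid, scores):
    """Build a JSON-serialisable record for one Trusted Review.

    Field names are hypothetical; a real collection would use the
    fields defined in its own metadata block.
    """
    return {
        "reviewedObject": dataset_pid,
        "scores": [{"criterion": k, "value": v} for k, v in scores.items()],
    }


# Three expert responses; the third expert skipped one criterion.
responses = [
    {"metadata_quality": 4, "geographic_accuracy": 3},
    {"metadata_quality": 5, "geographic_accuracy": 4},
    {"metadata_quality": 3},
]
scores = tabulate_scores(responses)
record = build_review_record("doi:10.1234/EXAMPLE", scores)  # placeholder PID
print(json.dumps(record, indent=2))
```

Keeping tabulation separate from record construction means the same scores could be entered by hand or passed to whatever upload route a collection chooses.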

### 2. Engagement: Voting

To complement expert assessments, the team is exploring ways to let users directly contribute to trust scores through voting. During discussions with the Google project team, our use‑case partners expressed a desire for in‑platform voting support, which aligns with internal Dataverse goals.

A proposed design involves a simple user‑interface element, with a function similar to a “like” button, that allows users to vote on datasets or other objects. Each vote would be stored in a table with fields for the persistent identifier, object type, version, timestamp and vote value. Different vote types could capture distinct measures (e.g., metadata quality or geographic accuracy), and the Dataverse engine would calculate aggregated statistics and store them alongside the Trusted Data metadata. A ‘who‑can‑do‑what’ mechanism would control who may cast votes, ensuring that simple “likes” are available to all users while more specialised trust assignments remain restricted.
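Since voting is still being designed, the following is only a minimal sketch of how the vote record, the permission check, and the aggregation might fit together. The `Vote` fields mirror those listed above, with an added `vote_type` field to distinguish measures; the role names, the `ALLOWED_ROLES` table, and the choice of count and mean as statistics are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Vote:
    pid: str            # persistent identifier of the object voted on
    object_type: str    # e.g. "dataset"
    version: str
    timestamp: datetime
    vote_type: str      # e.g. "like" or "metadata_quality"
    value: int


# Hypothetical who-can-do-what table: vote type -> roles allowed to cast it.
ALLOWED_ROLES = {
    "like": {"user", "expert"},
    "metadata_quality": {"expert"},
}


def can_vote(role, vote_type):
    """Return True if the given role may cast this type of vote."""
    return role in ALLOWED_ROLES.get(vote_type, set())


def aggregate(votes):
    """Return {(pid, version, vote_type): {"count": n, "mean": m}}."""
    grouped = defaultdict(list)
    for v in votes:
        grouped[(v.pid, v.version, v.vote_type)].append(v.value)
    return {
        key: {"count": len(vals), "mean": sum(vals) / len(vals)}
        for key, vals in grouped.items()
    }


now = datetime.now(timezone.utc)
pid = "doi:10.1234/EXAMPLE"  # placeholder identifier
votes = [
    Vote(pid, "dataset", "1.0", now, "like", 1),
    Vote(pid, "dataset", "1.0", now, "metadata_quality", 4),
    Vote(pid, "dataset", "1.0", now, "metadata_quality", 5),
]
stats = aggregate(votes)
```

Grouping by version as well as identifier keeps votes on an older dataset version from being counted toward a newer one, which seems consistent with storing the version on each vote.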

Voting support is still in the design phase, but it underscores the project’s commitment to user engagement. By allowing community members to vote, Dataverse can surface general perceptions of trustworthiness while complementing expert‑driven trust assessments.

### 3. Automated Trust Indicators

Beyond manual assessments and user voting, the team is working toward **automated Trust Indicators** across all Dataverse datasets. This is the most experimental of the three mechanisms. Our idea is to report “light” versions of trust indicators based on information already available within the Harvard Dataverse repository, such as dataset owner, depositor, purpose, and journal replication status.

These automated labels will provide immediate, albeit preliminary, indicators of trust and encourage dataset owners to improve their metadata. The approach is designed to be flexible: dataset owners can transition from automated labels to community‑defined Trusted Reviews, and organisations can require Trusted Reviews or automated Trust Indicators when appropriate. This automated layer will make dataset quality more visible and complement both the community‑driven mechanism and the user‑engagement mechanism.
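One way to picture the “light” indicators described in this section is as simple rules over metadata the repository already holds. The sketch below is illustrative only: the dictionary keys and the label strings are hypothetical, and the actual indicator definitions have not been settled.

```python
def light_trust_indicators(metadata):
    """Derive preliminary indicator labels from a dataset metadata dict.

    Keys and labels are hypothetical; each rule only checks whether an
    attribute is present, not whether it is accurate.
    """
    indicators = []
    if metadata.get("owner"):
        indicators.append("owner-identified")
    if metadata.get("depositor"):
        indicators.append("depositor-identified")
    if metadata.get("purpose"):
        indicators.append("purpose-stated")
    if metadata.get("journal_replication") is True:
        indicators.append("journal-replication")
    return indicators


labels = light_trust_indicators(
    {"owner": "Example Lab", "depositor": "", "journal_replication": True}
)
print(labels)
```

Because each rule depends only on whether a field is filled in, a dataset owner can see directly which missing metadata would add an indicator.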