U-M researchers building system to keep confidential data in trustworthy hands
ANN ARBOR—When news of the Cambridge Analytica scandal broke, 87 million Facebook users learned that their personal data was obtained and used by a British research firm to influence the outcome of the 2016 U.S. presidential election.
Many were left wondering how data accessed by established researchers could have been used inappropriately and without their knowledge.
A report from a team led by University of Michigan research professor Margaret Levenstein describes a new process to safeguard the research community’s access to and management of confidential data.
“Virtually all human activity in the modern world creates digital traces. It is our responsibility to ensure that the resulting data is protected and managed responsibly,” said Levenstein, director of the Inter-university Consortium for Political and Social Research at the U-M Institute for Social Research.
The Researcher Credentialing for Restricted Data Access project, funded by the Alfred P. Sloan Foundation, aims to make other data research centers more willing to share data and scientists better able to undertake creative new research projects, all with the least possible risk to privacy and confidentiality.
The project team plans to create what it calls a “System of Digital Identities of Access” that provides trusted researchers with a Researcher Passport to encourage all scientists and data centers to abide by the same rules. Data archives such as ICPSR or other data custodians can issue visas to passport holders so that they can access data. The visa will permit the passport holder access to particular datasets for specified periods of time.
The U.S. Census Bureau, ICPSR and others disseminate restricted data, each in its own physical and computing environment and each with its own process for vetting researchers. The Researcher Passport addresses the vetting, training and identification of trusted researchers. It does not replace the physical computing environments that these repositories use, but the visas issued by repositories will indicate which computing environment applies, according to the research team.
A pilot of the new credentialing system will debut in fall 2018.
Why the world needs this
Levenstein said that scientific research helps the public better understand the modern world.
“However, we want data to be used in ethical, responsible and transparent ways,” she said. “We advocate making sensitive or private data available to trusted researchers in a secure computing environment, to researchers trained in ethical practices—and with that access governed by legal agreements.”
The researchers offer three recommendations to address inconsistencies the project team found in how restricted data is accessed and managed:
- Language and process harmonization: Promote common terminology, standards, training and technologies to support interoperability across data archives and data custodians.
- Researcher Passport: Verify the credentials of trusted researchers so they can be shared digitally with archives and data custodians, and establish different levels of access with a path of progression as trust increases.
- Data access visa: Create a digital permissions system that archives and data custodians use to grant access and provide a record of access to restricted data.
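The passport and visa recommendations can be pictured as a small data model. What follows is only an illustrative sketch in Python, not the project's actual design: the class names, field names and trust tiers (`TrustLevel`, `ResearcherPassport`, `Visa`, `Custodian`) are all hypothetical. It assumes a visa names one dataset, a validity period and a computing environment, and that the custodian keeps the record of access.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import IntEnum
from typing import List, Tuple


class TrustLevel(IntEnum):
    # Hypothetical tiers; the article mentions levels of access but does not name them.
    BASIC = 1
    INTERMEDIATE = 2
    ADVANCED = 3


@dataclass
class ResearcherPassport:
    # Credential held by a vetted, trained researcher.
    holder: str
    trust_level: TrustLevel


@dataclass(frozen=True)
class Visa:
    # Permission for one passport holder to use one dataset for a fixed period.
    holder: str
    dataset_id: str
    environment: str  # computing environment the custodian requires
    valid_from: date
    valid_until: date

    def permits(self, holder: str, dataset_id: str, on: date) -> bool:
        return (
            holder == self.holder
            and dataset_id == self.dataset_id
            and self.valid_from <= on <= self.valid_until
        )


@dataclass
class Custodian:
    # A data archive that issues visas and keeps a record of access.
    name: str
    visas: List[Visa] = field(default_factory=list)
    access_log: List[Tuple[str, str, date, bool]] = field(default_factory=list)

    def issue_visa(self, passport: ResearcherPassport, required: TrustLevel,
                   dataset_id: str, environment: str,
                   valid_from: date, valid_until: date) -> bool:
        # Refuse the visa if the passport's trust level is too low for the dataset.
        if passport.trust_level < required:
            return False
        self.visas.append(
            Visa(passport.holder, dataset_id, environment, valid_from, valid_until)
        )
        return True

    def request_access(self, passport: ResearcherPassport,
                       dataset_id: str, on: date) -> bool:
        # Grant access only if a matching, unexpired visa exists; log every request.
        granted = any(v.permits(passport.holder, dataset_id, on) for v in self.visas)
        self.access_log.append((passport.holder, dataset_id, on, granted))
        return granted
```

In this sketch, a request made after a visa's end date, or for a dataset the visa does not name, is refused but still logged, so the custodian's record covers both granted and denied access.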
Social scientists are ready
The research community is poised to take advantage of this system, said Johanna Davidson Bleckman, who is part of the project team. Bleckman said investigators, data producers, repositories and policymakers share the broad goals of data sharing and evidence-based decision making.
“But we all struggle to balance those goals with the need to protect privacy, navigate regulatory restrictions, and to ensure responsible data storage and management,” she said. “This credentialing system seeks to facilitate the development of community norms around the ‘who, what, where and how’ of ethical data sharing, storage and stewardship.”
Bleckman said that recent data breaches and disclosure of sensitive information from social media platforms and others are a recurring reminder of the need to safeguard data used by the research community.
“Our ability to understand who has what data and for what reason has been outpaced by rapid advancements in technology, increasing personal data sharing via the web and our shared understanding of privacy,” she said.
The white paper draws on an analysis, conducted by U-M School of Information doctoral student Allison Tyler, of the policies of 23 data repositories in the United States, Europe and Australia.