Working Group Publishes New Draft Code for Researcher Access to Platform Data

Justin Hendrix / May 31, 2022
New code would enable independent researchers to scrutinize platform data safely

Nearly every list of recommendations on ‘what to do’ about problems attributed to social media in recent years has included some version of “provide access to data for independent researchers,” based on the idea that more scrutiny of how platforms work and their impacts on society will give policymakers, and the platforms themselves, the basis to make smart interventions driven by evidence. The European Union’s Digital Services Act (DSA) will require such access, and there are bills proposed in the U.S. House and Senate that would do the same.

But the mechanics of governing such access are themselves a complex subject, one that will require both the platforms and researchers to navigate difficult legal and ethical terrain.

As in most things related to tech policy, Europe is further ahead. For the past year, a working group run by the European Digital Media Observatory (EDMO) and the Institute for Data, Democracy & Politics (IDDP) at The George Washington University has endeavored to draft a Code of Conduct for how to provide researcher access under Article 40 of the EU’s General Data Protection Regulation (GDPR). The product of that effort was published today.

Rebekah Tromble, Chair of the EDMO Working Group on Platform-to-Researcher Data Access and Director of the IDDP, writes in an introductory note that the draft code “lays out careful guidance regarding the steps both researchers and platforms must take to ensure they are in compliance with the GDPR,” as well as “a framework for evaluating the relative risks involved in processing different types of data and then connects each level of risk to a number of organizational and technical safeguards to be implemented by platforms, researchers, and research institutions.”

Such a code should provide a roadmap for how to implement the requirements of the DSA, and could be very useful to policymakers and regulators in the U.S. as they contemplate a similar mechanism. Working group members included representatives from a variety of universities and civil society groups, as well as executives from Meta/Facebook, Twitter, and Google. AWO, a consultancy that works on data rights and privacy matters, helped prepare the draft.

Notably, the working group suggests there is a need for an independent body to serve as a governing membrane between platforms and researchers. This intermediary would review research proposals, assess risk, vet researchers, conduct ethical reviews, and carry out other activities necessary to facilitate research in a manner consistent with the law. The working group believes such a body “is critical to smooth functioning of platform-to-researcher data access.”

4. One recommendation from this Working Group is that there is a need for an independent intermediary body that can perform a number of key governance tasks, such as pic.twitter.com/A2w8VDv3vy

— Mathias Vermeulen (@mathver) May 31, 2022

The code describes safeguards and methods to ensure data access and processing are handled appropriately, from ethical and methodological review, data minimization, and pipeline auditing through to the anonymization of user data. Higher risk projects might also require access restrictions, encryption, and mandatory data destruction. The highest risk projects may utilize virtual or physical data clean rooms, where special protections would be in place to prevent mishandling of user or platform data. The draft code also contains a variety of supporting documents, including template forms for initiating a data request and for conducting ethics or methodological reviews of a proposed research project.

In their concurrence letters, executives from Twitter and Meta/Facebook indicate their support for the effort, and specifically for the creation of an intermediary body between researchers and the platforms. “Meta supports creating such a body as part of the text of the Code of Conduct,” the company said in its letter, while Twitter noted it “will be critical to ensure effective operation and governance.”

One issue still under debate is what types of researchers can qualify to participate under the code. Twitter, in its concurrence letter, writes that this is one area where “further work is needed”:

The Code currently specifies that research must be conducted by an “entity which has as one of its principal aims the conduct of research on a not-for-profit basis [...]” This leaves open opportunities for entities whose principal aim is not the conduct of research to advance knowledge via peer-reviewed publication as is generally the case for university researchers, but for organizations whose goals include advocacy oriented research for the purpose of shaping public dialogue or influencing public opinion.

Rebekah Tromble told Tech Policy Press that in preparing the draft code, the team felt it was important "to be as inclusive as possible within the limitations of the GDPR." But, she says, those limitations are themselves significant. "GDPR offers a special 'carve out' for 'scientific and historical research' only, not for journalistic research and reporting. It also refers to methodological and institutional requirements that set a relatively high barrier for any researcher, even those at academic institutions." That means that while the language may be inclusive, the bar is still high.

"While the draft Code may not solve all of the issues stemming from this dual relationship platforms – researchers, which involves making personal data available and then using it for scientific research, it is the most complex and nuanced exploration of the issue I am aware of," wrote the Future of Privacy Forum's Dr. Gabriela Zanfir-Fortuna, who served as a member of the working group.

The draft code will now face scrutiny from other experts.


Justin Hendrix
Justin Hendrix is CEO and Editor of Tech Policy Press, a new nonprofit media venture concerned with the intersection of technology and democracy. Previously, he was Executive Director of NYC Media Lab. He spent over a decade at The Economist in roles including Vice President, Business Development & ...