A Module Playbook for Platform-to-Researcher Data Access

Chris Riley, Susan Ness / Nov 20, 2022

Chris Riley is a Distinguished Research Fellow with the Annenberg Public Policy Center of the University of Pennsylvania. Susan Ness is a former member of the Federal Communications Commission and a distinguished fellow at the Annenberg Public Policy Center of the University of Pennsylvania.

Modularity is a form of multistakeholder, co-regulatory governance, in which modules—discrete mechanisms, protocols, and codes—are developed through processes that include a range of perspectives. This novel co-regulatory approach proposes to achieve greater multinational alignment in internet governance, while respecting the inherent sovereignty of national and regional governments, some of which have already adopted new digital platform accountability regimes. Modularity works by identifying tasks common to laws in multiple countries and creating global, multi-stakeholder processes and institutions that can operationalize those tasks. It facilitates compliance with different jurisdictions and reduces the cost of enforcement, all without the necessity of treaties or other mechanisms that constrain or replace official authority.

By fostering multinational alignment of norms and practices, modularity offers digital platforms and their users greater operational consistency and clarity, and reduces government outlays for drafting customized rules for the same function. Moreover, this alignment can help governments develop policies in an inclusive manner, rather than following the lead of early movers whose solutions are designed principally for their own legal and regulatory environments and not the collective whole. And yet, it provides room for subsequent tailoring according to local and regional needs.

Modularity’s potential for improving international internet governance seems clear. However, a fuller articulation is warranted of how to implement modularity in practice, both to identify areas needing further work and to encourage its adoption. We have chosen the policy issue of platform-to-researcher data access to illustrate how a modular regime would work. This “Playbook” describes the functions a module might perform, suggests bodies that could perform those functions, and lays out the roadmap for empowering such bodies within the context of law.

A modular approach to researcher access would include:

A new multistakeholder, international body that would develop a standard for both vetting researchers and reviewing platform access policies;
A body to operationalize the vetting standard in individual cases; and
A body to mediate disputes that arise between researchers and platforms.

The same or different entities could be deployed to handle these functions.

A starting place for consideration of the nature and structure of such a body, as well as for the substance of a suitable standard, is the recent report of the European Digital Media Observatory (EDMO). It proposes that an Independent Intermediary Body (IIB) be established to approve research requests for access to platform data consistent with data protection requirements under the General Data Protection Regulation (GDPR) of the European Union. Critically, the bodies set up to carry out these functions would not directly enforce the underlying law, but rather would reduce the need for government enforcement by facilitating agreement between the platform and the researcher. Where agreement is not reached, the modular body could provide the regulator with a developed record. And in the context of the Digital Services Act (DSA), which includes provisions regarding platform-to-researcher data access, a delegated act could empower entities operating in a module to assume these transactional duties.

1. Problem statement

One of the most well-developed issues in the digital platform governance space is that of researcher access to platform data. Substantial voluntary efforts have conducted deep dives on the possibilities and pitfalls of providing independent researchers with greater access to internal platform data, beyond what is made available to an ordinary platform user or through existing application programming interfaces (APIs) designed for researchers. However, the execution of these projects has typically been fraught with complexity and delays, leaving the independent researcher community essentially without effective access to sufficient platform data to study a variety of important questions.

This Playbook will not summarize or repeat the extensive work that has been done by others analyzing the platform-to-researcher data access problem. The August 2022 report by Caitlin Vogus of the Center for Democracy and Technology provides a useful overview. It discusses the nature of the requested data, the vetting criteria for researcher identification and verification, and methods for providing access.

Several governments have proposed laws that would mandate researcher access to platform data, subject to vetting of both the researchers and the research plans. For example, in the U.S., the Platform Accountability and Transparency Act (PATA) bill would create a data access mandate along with a new government office and process dedicated to vetting researchers and overseeing their engagements with platforms. Article 40 of the DSA requires platforms to provide data access to vetted researchers upon request from a government enforcement authority. While the work-in-progress Online Safety Bill (OSB) in the United Kingdom does not reach as far, it nevertheless obligates the regulator Ofcom to produce a report on the issue and gives authority to Ofcom to establish guidance for platform practices related to researcher access.

As these laws are adopted and regulatory mechanisms are developed, common implementation challenges have emerged. Because of the similar purpose of the laws, every country implementing a version will face the same questions regarding how to vet researchers and research projects, how to handle privacy issues and safe harbors, and how to review platform access policies. This module outlines the series of objectives, operations, and entities capable of taking on the administrative implementation tasks common to present and future laws, without undermining the underlying individual legal obligations, nor compromising the enforcement authority of each jurisdiction.

2. Example of implementing legislation under DSA

While a platform-to-researcher data access module is intended to function in multiple jurisdictions, understanding how it will work benefits from reference to the DSA as an example of such integration. Article 40 of the DSA sets out a general obligation for data access, as further specified in the legislative text:

Upon a reasoned request from the Digital Services Coordinator of establishment, providers of very large online platforms or of very large online search engines shall, within a reasonable period, as specified in the request, provide access to data to vetted researchers who meet the requirements in paragraph 8 of this Article, for the sole purpose of conducting research that contributes to the detection, identification and understanding of systemic risks in the Union, as set out pursuant to Article 34(1), and to the assessment of the adequacy, efficiency and impacts of the risk mitigation measures pursuant to Article 35.

Successful execution of this provision involves the following steps:

A researcher (individual or group) prepares a plan and submits it for consideration to the applicable Digital Services Coordinator (DSC), a designated government agency that operates within a member state with specific responsibilities under the DSA. The submitted plan must include the resources or APIs for which access is needed (Article 40(7)).
The DSC reviews the request under the legislative criteria in Article 40(8) and, if approved, designates the researcher as vetted and sends the request to the platform.
The platform has 15 days to object to the request on grounds of technical or security concerns, and must offer an amended request with different methods or data suitable to the purpose of the request (Article 40(5)-(6)).
The DSC will respond to this request within 15 days with a final plan and timetable for compliance by the platform (Article 40(6)).
The Commission is directed in Article 40(13) to develop delegated acts (i.e. new laws, but not subject to the processes or proactive approval of the Parliament or Council) that specify in more detail the “technical conditions” of providing access, including “where necessary, independent advisory mechanisms,” in part to ensure compliance with the General Data Protection Regulation (GDPR).

Collectively, the text of the DSA leaves open clear opportunities for normative and operational activity to contribute to implementation of the law’s objectives.
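
To make the sequence and its deadlines concrete, the steps above can be sketched as a small state model. This is an illustrative sketch only, not part of the DSA or of any official tooling: the class and method names (AccessRequest, dsc_review, platform_amend, and so on) are hypothetical, and it assumes both 15-day windows are counted in calendar days. It covers only the path in which the platform objects; an unobjected request is not modeled.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import List, Optional

# Assumption: both 15-day windows are counted in calendar days.
PLATFORM_OBJECTION_WINDOW = timedelta(days=15)
DSC_RESPONSE_WINDOW = timedelta(days=15)


class Status(Enum):
    SUBMITTED = "plan submitted to DSC"
    REJECTED = "DSC declined to vet"
    SENT_TO_PLATFORM = "researcher vetted, request sent to platform"
    AMENDMENT_PROPOSED = "platform proposed an amended request"
    FINAL = "DSC issued final plan and timetable"


@dataclass
class AccessRequest:
    researcher: str
    resources: List[str]  # data or APIs named in the research plan
    status: Status = Status.SUBMITTED
    sent_on: Optional[date] = None
    amended_on: Optional[date] = None

    def dsc_review(self, meets_criteria: bool, today: date) -> None:
        """DSC applies the vetting criteria, then forwards or rejects."""
        self._require(Status.SUBMITTED)
        if meets_criteria:
            self.status, self.sent_on = Status.SENT_TO_PLATFORM, today
        else:
            self.status = Status.REJECTED

    def platform_amend(self, alternative: List[str], today: date) -> None:
        """Platform objects in time and offers alternative data or methods."""
        self._require(Status.SENT_TO_PLATFORM)
        if today > self.sent_on + PLATFORM_OBJECTION_WINDOW:
            raise ValueError("platform objection window has closed")
        if not alternative:
            raise ValueError("an objection must come with an amended request")
        self.resources, self.amended_on = list(alternative), today
        self.status = Status.AMENDMENT_PROPOSED

    def dsc_deadline(self) -> date:
        """Last day for the DSC to answer the platform's amended request."""
        self._require(Status.AMENDMENT_PROPOSED)
        return self.amended_on + DSC_RESPONSE_WINDOW

    def dsc_finalize(self, today: date) -> bool:
        """Record the final plan; return True if it was issued on time."""
        on_time = today <= self.dsc_deadline()
        self.status = Status.FINAL
        return on_time

    def _require(self, expected: Status) -> None:
        if self.status is not expected:
            raise ValueError(f"expected {expected.name}, got {self.status.name}")
```

A body operating within a module could use a model of this kind to check that its own review timetable fits inside the statutory windows before it seeks recognition from a regulator.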

3. Module Objectives

A module, consisting of a set of operations executed by one or more bodies, can help support implementation of the DSA while simultaneously contributing to the implementation of future laws in other jurisdictions, resulting in normative and operational alignment in the middle layers of each country’s regulatory process. This section will describe the objectives of such a module, and the next will describe the steps to implement these objectives.

A module for vetting researcher access to data would fulfill three high-level objectives: (1) establishing (and evolving over time, as necessary) a standard of behavior for platforms and researchers; (2) operationalizing the standard to vet research requests and review platform access in practice; and (3) mediating disputes that arise in specific cases.

3.1 Establishing a standard

One objective of a module is to develop and maintain a single common standard of behavior or process that can be used across multiple legal jurisdictions. In this context, “standard” does not mean a formal or technical standard, but the language must be clear and substantial enough to serve as a roadmap for evaluation, with enough specificity to support rejection of proposals where warranted.

In the context of researcher access, this standard would likely include the level and nature of access to data made available by platforms; platform and researcher policies to protect against abuse; and commitments by researchers regarding use of the data, including structural conditions such as independence. Ideally, the module would receive prior approval (or acknowledgement) from relevant regulators that platform and researcher compliance with the standard would suffice for various specific legal requirements; however, even if there is no acknowledgement, compliance with the standard should mitigate the risk of violation.

It will be challenging for a module to craft a single standard that fully aligns with the legal requirements of multiple jurisdictions. For example, the right to access platform data provided by the DSA covers research examining platform compliance with risk assessment and mitigation requirements, albeit broadly defined (Article 34(1)). But legislation in other jurisdictions might permit vetted researcher access for a wider range of topics, including ancillary research that does not fit within the DSA’s parameters.

While the global regulatory conversation around researcher access is still at a relatively early stage, there will likely be some outliers -- future laws that impose significantly narrower or broader access categories, conditions, paperwork, and scope. However, building a module at this early stage of regulatory development creates a “gravity well” effect, in that government systems that are “close enough” in their outputs and expectations for researcher vetting processes, for example, will have incentives to align more when it opens the possibility of a common process and reduced implementation and ongoing costs. This supports and extends intentional discussions of shared approaches, such as between the US and EU through the Trade and Technology Council (TTC).

Module execution can facilitate and strengthen that gravity well effect by inviting contributions from a broad range of stakeholders in the process of developing a common standard, including experts from multiple countries and participating governments. Proactive transparency and outreach to relevant stakeholders will improve the outcome.

3.2 Operationalizing vetting researcher access requests and platform access

After a standard is developed, the module must apply the standard in practice, evaluating a proposed research plan for necessary safeguards and other commitments as well as examining the scope, policies, and level of access to data offered by a platform to researchers. This may result in providing direction to the researcher or the platform to make changes to comply with the standard.

Should either a research partner or platform be unable or unwilling to make changes necessary to be validated by the module standard and move forward with the research, the module objective cannot be achieved, and the process terminates. If the platform is alleged to be in non-compliance with underlying legal obligations in a relevant jurisdiction, the record developed through the course of evaluating the platform’s scope, policies, and level of access to data would be made available to the relevant regulator. Similarly, if a researcher is offering substandard safeguards and commitments, then the record developed through the course of module operation would be made available to the relevant regulator if the platform chooses to seek redress.

3.3 Mediating disputes

A third purpose of a platform-to-researcher data access module is to mediate specific disputes that arise after a research plan has been approved and the research is undertaken. For example, a researcher might mishandle data once received, or a platform might not offer the level of access originally agreed upon. While the authority to sanction researchers or platforms for violations of underlying law should lie with individual government regulators, a module can assist by resolving some disputes through mediation and seeking compromise. Importantly, this function would be limited to mediation and voluntary resolution, and not any form of arbitration imposing an outcome against one party’s will. In circumstances where mediation is not effective and, in particular, where legal sanctions are potentially warranted, the module can develop a record with supporting evidence to streamline enforcement processes.

4. Operations

This section will describe in greater detail the steps involved in executing on these objectives, laying out a more complete roadmap to build a module for researcher access to platform data. The next section will highlight current and future bodies or structures that could be crafted to carry out the necessary operations. For a module to implement the objectives of establishing a standard, vetting requests, and mediating disputes, it would need to perform the following operations in the setup and ongoing periods of performance:

4.1 Developing a standard (Setup and ongoing)

Step one is developing a standard with clear behavioral guidance, including standardized data protection and security expectations as well as a fitness-for-purpose test, to vet researchers and research plans as meeting both privacy and data access expectations.

The module’s activities should include periodic review and modification of the standard over time, based on experience. Ideally, the multi-stakeholder body assembled to develop the standard should continue in existence to conduct periodic reviews and update the standard as warranted.

There are many approaches to create the body tasked with crafting the standard. For this Playbook, we have assumed that informal discussions would occur among expert stakeholders (academics, industry, civil society -- and possibly representatives of participating governments) already working on this issue to assemble a diverse working group that would draft the research module, and concurrently engage in back and forth discussions with one or more governmental authorities for recognition. We assume that a single body ultimately will be responsible for this operation.

We have not yet drilled down on this crucial element, but in Section 5 (Building blocks) we offer some examples of institutions and scenarios for additional research.

4.2 Developing a process (Setup)

Step two is developing a process, and creating the body that will run the process, to vet and approve (or reject with feedback) proposals for researcher access under the standard adopted in 4.1.

We have listed the development of the process for implementing the standard as a distinct modular function. But because the substance of the standard and the process applying it are intertwined, it is likely, but not necessary, that the same body created to develop the standard will also propose a process to apply it.

Alternatively, the body developing the standard (operation 4.1) could designate an outside working group, or a specific subset of its overall membership, to develop the process for applying the standard. If a smaller team or separate body is tasked with designing the application, then the body creating the standard should review and approve its output.

Notably, the process of applying the standard must take into account legal requirements in jurisdictions where government recognition is sought. For example, the Digital Services Act sets out specific processes and timetables by which the member state DSC of record must evaluate a researcher access proposal. If vetting a research proposal under the standard takes longer than the timetable in the DSA, then the module will not comply with EU law.

4.3 Executing the process to evaluate researcher plans and platform access compliance (Ongoing)

A process to evaluate researcher qualifications and plans and access compliance under Operation 4.2 should aim for simplicity, be appropriately expeditious and, most importantly, be scalable. The administrative function of vetting individual researchers and data requests can involve a smaller team of reviewers, under the oversight of a board or other body, working under a fixed timetable.

In contrast to the work of the body setting up the system for vetting researchers (Operation 4.2), applying a standard in a specific instance needs to occur more rapidly, and consequently is best executed with a smaller group of individuals and stakeholders, working to a predetermined and fixed timetable.

The body responsible for designing the standard by which researchers and research projects would be evaluated (Operation 4.1) could designate a smaller subset of stakeholders or a separate structure to carry out the actual vetting and evaluation work (Operation 4.3), and set reporting requirements.

Working together, the body responsible for vetting researchers (Operation 4.3) and the body responsible for designing the standard and bodies (Operation 4.1) should periodically examine whether the vetting process or the standard/guidelines should be modified based on experience.

After approval of a research request and a platform’s stated plans for compliance, additional disagreements may occur over time including, in particular, a platform’s failure to deliver the relevant data or a researcher’s failure to adhere to data security and other usage restrictions. Either of these can result in violations of one or more underlying legal obligations for the parties, in addition to violations of the standard. The module should include a mechanism for the parties to mediate disputes as they arise in order to resolve disagreements whenever possible without needing government referral. Operation 4.4 develops a mediation protocol, and Operation 4.5 describes how a mediation regime would work in practice.

4.4 Developing mediation methods (Setup)

The next step is developing mediation methods to review researcher assertions of insufficient access provided by platforms, as well as platform assertions of researcher protocol violations.

The body designing the standard (Operation 4.1) could also develop a mediation methodology, or it could delegate that task to a different entity. In theory, multiple methods or processes of mediation, developed by multiple bodies, could be workable within a single researcher access module. Each method should be validated by the body responsible for designing the research standard (Operation 4.1) or its designate. The system should be designed to scale, and be available for researchers and platforms in the jurisdictions that have recognized the module.

4.5 Operationalizing mediation in specific instances (Ongoing)

As we stated in the case of operationalizing the vetting system for researchers (Operation 4.3), applying a process for mediation requires adherence to tight timelines. The same body that designed the mediation process (Operation 4.4) could also run the process or, alternatively, a subgroup could be tasked with running the process.

In practice, multiple bodies could operate in parallel as independent mediators, each authorized to take on the mediation process (Operation 4.5). All that would be required for a voluntary mediation process would be the acceptance of the mediator by both the researcher and the platform, and the mediator’s agreement to abide by the process.

The body set up to oversee the mediation system could receive complaints from either platform or researcher, evaluate them including consulting with the other party, and propose remedial measures if determined to be appropriate to ensure compliance with the standard and the accepted plan. If the parties accept these remedial measures as sufficient to resolve the complaint, and agree as applicable to execute them, then the matter is considered resolved.

Recognizing that in some circumstances remedial measures may be insufficient or may not be implemented, the module should develop a complete record of the dispute, together with the activities by platform and researcher as well as the initial research plan and review, containing sufficient detail to allow authorities in relevant jurisdictions to proceed according to their rules.
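
The mediation flow described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a proposed implementation: the names Dispute, Outcome, and mediate are hypothetical, and the sketch assumes that a complaint for which no remedial measures are proposed is treated as unresolved. The function never imposes an outcome; it either records a voluntary agreement or leaves a complete record for the relevant regulator.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Outcome(Enum):
    RESOLVED = "resolved by agreement of both parties"
    RECORD_FOR_REGULATOR = "unresolved; record available to regulator"


@dataclass
class Dispute:
    complainant: str  # "researcher" or "platform"
    summary: str
    record: List[str] = field(default_factory=list)  # running evidentiary record


def mediate(
    dispute: Dispute,
    measures: List[str],
    researcher_accepts: bool,
    platform_accepts: bool,
) -> Outcome:
    """Log each step of a voluntary mediation and return its outcome."""
    dispute.record.append(
        f"complaint by {dispute.complainant}: {dispute.summary}"
    )
    dispute.record.append("other party consulted")
    for measure in measures:
        dispute.record.append(f"proposed remedial measure: {measure}")

    # Mediation is voluntary: the matter is resolved only if measures were
    # proposed and both parties accept them. Assumption: no proposed
    # measures means the complaint remains unresolved.
    if measures and researcher_accepts and platform_accepts:
        outcome = Outcome.RESOLVED
    else:
        outcome = Outcome.RECORD_FOR_REGULATOR
    dispute.record.append(f"outcome: {outcome.value}")
    return outcome
```

Whatever the outcome, the record list holds the complaint, the consultation, each proposed measure, and the result, which corresponds to the developed record the module would hand to authorities if mediation fails.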

5. Building blocks

This section will highlight examples of multistakeholder institutions that could potentially take on some of the operations of a researcher access module, or are useful concepts or examples to consider in developing new bodies. We then acknowledge other critical areas that require significant work.

The composition of the bodies executing each of the operations above will differ by the operation, and may evolve over time. Some functions may be combined within the same body: for example, developing a standard (operation 4.1); and the process for operationalizing the standard (operation 4.2).

The EDMO report articulates the need for an “independent intermediary body” (IIB) tasked with implementing the code proposed in the report, specifically noting that an IIB could take on the operational function of vetting researchers and research plans. Such an intermediary could serve as a module for researcher access to platform data writ large, and not just for the purpose of complying with GDPR data protection requirements, which is the general focus of the EDMO report. The recommended IIB is the closest existing proposal to the concept envisioned in this Playbook.

A project by the Carnegie Endowment and Princeton University proposes the creation of an international Institute for Research on the Information Environment (IRIE), modeled in part on the European Organization for Nuclear Research (CERN). U.S. Congresswoman Lori Trahan (MA-3) tabled a similar proposal. Such an institute could serve as a hub for data and technical resources to facilitate research. The Forum on Information & Democracy separately proposed the creation of an “Observatory on Information & Democracy,” modeled on the Intergovernmental Panel on Climate Change (IPCC). Both IRIE and the Observatory could support global socio-technical research capabilities and coordination, and partner with policymakers to improve government understanding of the information environment.

A globally diverse group of NGOs has established the Action Coalition on Meaningful Transparency (ACT) to bring together a wide range of actors working across digital transparency issues, including access to data for research purposes. The ACT’s broad mandate and membership could position it to help coordinate researcher access to platform data. Building upon the relevant work product of knowledgeable professional groups would add value and reduce the time needed to be operational.

Models and structures from other areas of governance are also instructive. For example, the Global Network Initiative (GNI) brings together 85 academics, civil society organizations, ICT companies, and investors from across the world and facilitates the confidential sharing of non-public information by companies, which is then used to evaluate their methods for responding to government demands and restrictions. The Global Internet Forum to Counter Terrorism (GIFCT) may also be a useful example for creating a forward-looking structure for multistakeholder engagement. GIFCT has developed mechanisms for coordination among stakeholders from multiple sectors, to facilitate a rapid response during crises. Both GNI and GIFCT could provide lessons useful for the challenge of developing a shared platform-to-researcher data access standard incorporating many different perspectives.

In the United States, the Financial Industry Regulatory Authority (FINRA) is an example of an effective self-regulatory, multistakeholder organization, capable of developing and enforcing rules for the financial industry within U.S. law through statutory authority and oversight by the Securities and Exchange Commission (SEC). A private company, FINRA chooses its own board of directors: almost half are elected by the members of the securities industry and a simple majority are selected by the board as independent, public company members. FINRA is funded by the members of the securities industry, who must be licensed by FINRA in order to trade securities.

More recently, the Horseracing Integrity and Safety Act of 2020, introduced by Senator Mitch McConnell (R-KY) and adopted as part of a subsequent appropriations act, establishes a self-regulated nonprofit corporation with authority to set certain rules for horse racing. While such an approach would require separate legal action in multiple jurisdictions, a single entity could in theory receive delegated authority or recognition from multiple sources to support its activities.

Regardless of the nature and approach of such future institutions, part of the theory of modularity is that value-added functions -- such as the vetting of researchers for responsible data practices and the evaluation of platform access policies and practices for sufficiency to enable research -- may be used in jurisdictions where legal instruments and obligations are not as fully specified or finalized as the DSA is in Europe. Facilitating research through the use of intermediary bodies -- even if not required by law -- offers substantial benefits for researchers (including those in the Global South), governments, platforms and the public.

In order to be effective, some of the operations outlined in this Playbook -- specifically executing the vetting process and mediation function -- must be designed to be scalable. More thought is needed about the design, staffing, leadership, and permanent funding of such bodies, including the credentials necessary for the vetters, capacity to support multiple languages within the bodies, and the geographic location of the operation, if any.

Similarly, much more work is needed on funding mechanisms to sustain the proposed bodies. Financial support for both initial setup of the module and for forward-looking operations could come from a variety of sources, including industry contributions (whether provided voluntarily or as a result of legal obligation), government funding (potentially as a share of platform fees supporting regulatory frameworks), or philanthropic grants.

6. Getting to Yes

To achieve the best results from a modular system, government regulators should accept the module with formal or informal recognition within their regulatory regimes. In the United States, United Kingdom and other jurisdictions where laws have not yet been passed, there remains the possibility for outside multistakeholder bodies to seek and receive ex ante approval from the relevant regulators. In the EU, the DSA leaves some room, but further action may be necessary.

In particular, as noted above, Article 40(13) of the DSA directs the Commission to adopt delegated acts to address the “technical conditions” of providing access and to incorporate external input into questions of GDPR compliance in particular. The Commission should be encouraged to adopt such delegated acts as necessary to empower multijurisdictional operating bodies with the tasks of (1) developing guidelines for researcher vetting; (2) evaluating proposals for access and complaints that arise after access is granted; and (3) submitting a developed record of evidence to the Commission or the relevant DSC for further action. (Note that violations of the GDPR would likely require separate processes.)

Such a delegated act would need to operate within the DSC’s legal obligation to respond within 15 days. Given such a short timeframe, the delegated operating body would most likely build a record of evidence and make an initial judgment, and the DSC would lightly review and confirm the body’s judgments. The DSC could later assess the operating body’s performance and submit a report to the Commission and the Board.

Modules can first be created through the bottom-up energy that flows from effective multistakeholder processes, resulting in the design of individual bodies that can take on normative and operational functions. This alone contributes to information sharing and collective normative development, resulting in improved practices and, in this example, increased researcher access to platform data with appropriate safeguards.

But modularity offers additional value: Once such a body is established, it can be offered as a solution for governmental processes, including both the “delegated act” structures of the European Commission as well as the “self-regulatory” possibilities under consideration in the U.S. Congress, encouraging these processes to begin from an assumption of legitimacy of such a body. Of course, in Europe under the DSA, the authority to determine legal compliance would ultimately remain with relevant state DSCs. Similarly, the U.S. Congress could empower a module-implementing body with authority to oversee data access, while remaining subject to oversight by the Federal Trade Commission (FTC), or an agency within the executive branch.

A delegated act by the European Commission that recognizes an independent intermediary body such as that proposed by the EDMO report, and a corresponding legislative vehicle in another jurisdiction such as Canada, Australia, or the United States that empowers the same body to fulfill the same functions, would effectuate the goal of modularity. It would facilitate standardized, multijurisdictional platform-to-researcher data access. Additionally, democracies with fewer resources for legislative and regulatory development and enforcement, or without explicit data access mandates, would have an incentive to collaborate, conveying legitimacy to a shared global “delegated authority” and perhaps gaining a seat at the table as norms and practices are developed at larger scale. Over time, the efficiency and merits of working together should offer substantial rewards.

Authors

Chris Riley

Chris Riley is Executive Director of the Data Transfer Initiative and a Distinguished Research Fellow at the University of Pennsylvania’s Annenberg Public Policy Center. Previously, he was a senior fellow for internet governance at the R Street Institute. He has worked on tech policy in D.C. and San...

Susan Ness

Susan Ness is a distinguished fellow at the Annenberg Public Policy Center (University of Pennsylvania), where she leads the Modularity Project, a co-regulatory tool for democracies to align on digital governance despite different regulatory frameworks. Previously, she convened the Transatlantic Hig...