Perspective

There is More Online Election Discourse than Ever, But Researchers See Less

Josephine Lukito, Kaitlyn Dowling / Nov 7, 2025

This piece is part of “Seeing the Digital Sphere: The Case for Public Platform Data” in collaboration with the Knight-Georgetown Institute. Read more about the series here.

In the United States, politicians have increasingly incorporated social media into their electoral campaigning. Politicians seem to be on virtually every platform online, ranging from mainstream social media like Facebook and YouTube to ideological and niche social media like Truth Social. Even in the 2025 off-year US elections, political campaigns are producing a massive amount of advertising and promotional content. In New Jersey alone, Senator Cory Booker and gubernatorial candidate Mikie Sherrill have already posted over 2,500 times across five different platforms during the last 90 days, a number likely to rise in the final month of the election.

But archiving and sharing this discourse with citizens has become more difficult, as politicians are present across a wider variety of media than ever before.

Why is public data critical for elections?

The strategic goals of campaign rhetoric on social media are similar to the press releases, speeches, and fireside chats of the past. For politicians, communicating with the public is essential to gaining the support of voters. In today’s digital media ecosystem, social media platforms allow politicians to connect directly with voters, bypassing traditional gatekeepers such as journalists. Research has highlighted how politicians have leveraged social media to emphasize intra-party loyalty, track public attention to issues, and garner media attention.

Citizens have a right to know what politicians are saying, especially during elections, to make informed voting decisions. Yet the affordances of social media often make it easier for politicians to hide their statements, such as by deleting controversial posts. Efforts like Politwoops, which archived deleted posts from politicians’ Twitter accounts, highlight the importance of accessing, archiving, and preserving electoral discourse for the public. Platforms know this, too, as evidenced by Meta’s prior attempts to build CrowdTangle-based dashboards for Facebook and Instagram during the European Union and Indian elections.

What is the current state of data access?

The current state of data access could be described as approaching a data desert. Platforms that were once willing to work with researchers to provide data have become more restrictive or have ended their data access programs altogether. Despite regulatory efforts such as Article 40(12) of the Digital Services Act, companies have been slow to comply or have developed highly restrictive, extremely vague “programs” that hardly provide any data access. When platforms do provide access, the data may be inaccurate or so limited that research of interest cannot be pursued.

One example is Meta, which has launched many data access programs in the past, from Social Science One to Facebook Open Research and Transparency (FORT). In 2024, Meta ended access to CrowdTangle, a popular tool used by many researchers studying conversations about elections globally, right in the midst of the 2024 US elections. While Meta introduced a new data access program, the Meta Content Library, as a replacement, its more stringent and limited data access makes it an insufficient substitute. These developments have made it even more challenging for researchers to study social media’s impact on elections.

But Meta is not alone. In the past few years, X has made its data access API far more costly, and Reddit ended its support of the archive project Pushshift, which was a critical source of Reddit data for researchers. Even worse, some platforms have pursued legal action against scholars conducting research that is critical of them. X, in particular, has attempted to silence researchers documenting the growth of hateful content on the platform. While data scraping is an alternative, researchers have expressed concerns about both its legality and the practical challenges of maintaining scrapers, particularly during high-activity periods such as media events and elections.

A year after the 2024 US presidential election and elections in more than 60 other countries, little has improved. Despite efforts by policymakers, researchers, and journalists to advocate for greater data access, platforms continue to obstruct efforts to study public discourse.

What’s needed to improve access to public platform data?

Academic and civil society researchers depend on access to public platform data to understand how elected officials and campaigns use social media. However, the rollback of data access programs in favor of more limited offerings has resulted in a fragmented patchwork of programs and research tools that can be costly, overly technical, or require burdensome application processes.

Though the current landscape for accessing public platform data can best be described as a desert, it has never been particularly lush. Both today and in the recent past, research using public platform data has been limited and uneven. Not all platforms have made equal efforts to make data accessible to researchers and civil society, and most research has focused on Western, English-speaking democracies. There is so much more data that deserves our attention, particularly in under-researched countries and languages, across the whole spectrum of mainstream and alternative social media platforms.

The "Better Access Framework" lays out concrete standards for platforms that would enable researcher access to high-influence public platform data, making our online public sphere more observable and transparent across platforms, countries, and contexts. Such public data are most likely to affect voters, elections, and democracies at large. Making this kind of data available to researchers will enable greater transparency and a deeper understanding of our shared online environment, while balancing concerns about privacy and security.

How does the framework help advance that type of access?

The Framework identifies four types of high-influence data, two of which are most relevant to election research: highly-disseminated content and content posted by government and political accounts.

Highly-disseminated content concerns posts that have achieved exceptionally high reach. The Framework identifies explicit numerical reach and engagement benchmarks to address the challenge of clearly defining “viral” content. These thresholds enable researcher access to the most visible, relevant platform data, while mitigating privacy risks for the vast majority of accounts creating low-reach, low-engagement content.
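
To make the idea of a numerical benchmark concrete, the following is a minimal sketch in Python of how a threshold rule might separate high-reach posts from everything else. The threshold values, the field names, and the choice to treat either benchmark as sufficient are all assumptions made for illustration; they are not the Framework's actual figures or rules.

```python
from dataclasses import dataclass

# Hypothetical placeholder thresholds, chosen only for illustration.
# The Framework's actual benchmarks are not reproduced here.
MIN_VIEWS = 100_000
MIN_ENGAGEMENTS = 10_000


@dataclass(frozen=True)
class Post:
    post_id: str
    views: int
    engagements: int  # e.g. likes + shares + comments (an assumed definition)


def is_highly_disseminated(post, min_views=MIN_VIEWS,
                           min_engagements=MIN_ENGAGEMENTS):
    """Return True if the post meets either benchmark (assumed rule)."""
    return post.views >= min_views or post.engagements >= min_engagements


def filter_highly_disseminated(posts, min_views=MIN_VIEWS,
                               min_engagements=MIN_ENGAGEMENTS):
    """Keep only posts above the benchmarks; low-reach posts are excluded."""
    return [p for p in posts
            if is_highly_disseminated(p, min_views, min_engagements)]


posts = [
    Post("a", views=250_000, engagements=500),
    Post("b", views=40, engagements=3),
    Post("c", views=9_000, engagements=12_000),
]
shared_with_researchers = filter_highly_disseminated(posts)
```

In this sketch, post "b" never enters the researcher-facing dataset, which is the privacy property the thresholds are meant to provide for low-reach accounts.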

Highly-disseminated content has the ability to shape public understanding of elections, candidates, and government and to influence agendas. During the 2024 US general election, clips from the presidential debate between Joe Biden and Donald Trump circulated widely across a number of mainstream and alternative social media platforms, amassing millions of views within hours. These clips shaped how voters perceived Biden’s performance and ongoing media narratives about his health and fitness for office. Viral moments, amplified by social media users and influencers, demonstrate how highly influential content significantly impacts the offline world.

Political and government accounts are those belonging to elected officials, candidates for office, major political parties, and institutions that directly influence governance. At a minimum, platforms should provide researcher access to data on heads of state, members of national-level legislatures, courts, and ministries/government departments through a proactive data interface, as these communications are inherently a part of governance. In addition, the Framework calls on platforms to identify and provide corresponding data on political candidates, major political parties, and party leaders. These communications directly influence elections and civic participation and warrant transparent, independent scrutiny. The 2025 elections make the importance of political and government accounts apparent.

In New York, Zohran Mamdani’s mayoral campaign has used a video-heavy strategy across platforms like TikTok and Instagram to rapidly gain support among voters, leading him to beat former Gov. Andrew Cuomo in the ranked-choice primary. The Mamdani campaign has also capitalized on a unique Instagram strategy, “comment for a DM,” which makes it easier to send links to followers via direct message. In response to comments, the campaign has sent as many as 77,000 messages with calls to action that supporters can take, but this novel strategy remains understudied. Though direct messages are not part of the Framework’s definition of public data, comments on these types of posts represent an important form of public engagement, helping researchers understand how candidates mobilize supporters.

Identifying these political actors and providing the corresponding data through a data interface (such as a research API or searchable database) ensures that the data are made accessible widely, without requiring researchers to request the data individually. For content at the sub-national level, the Framework provides additional mechanisms for providing researcher access to political and government accounts through data requests or permission for independent collection when platforms are unable to provide data.
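
The tiered access described above can be sketched as a small routing rule. The Python below is one possible reading of the text, not the Framework's specification: the enum names, the `platform_can_provide` flag, and the assumption that independent collection applies only when a platform cannot fulfil a data request are all hypothetical.

```python
from enum import Enum


class Level(Enum):
    NATIONAL = "national"
    SUB_NATIONAL = "sub-national"


class Mechanism(Enum):
    DATA_INTERFACE = "proactive data interface (research API or searchable database)"
    DATA_REQUEST = "data request to the platform"
    INDEPENDENT_COLLECTION = "permission for independent collection"


def access_mechanism(level, platform_can_provide=True):
    """Pick an access route for a political or government account.

    Assumed reading: national-level accounts are served proactively;
    sub-national accounts go through a data request, falling back to
    independent collection when the platform cannot provide the data.
    """
    if level is Level.NATIONAL:
        return Mechanism.DATA_INTERFACE
    if platform_can_provide:
        return Mechanism.DATA_REQUEST
    return Mechanism.INDEPENDENT_COLLECTION


# Example: a sub-national account on a platform that cannot supply the data.
route = access_mechanism(Level.SUB_NATIONAL, platform_can_provide=False)
```

The point of the sketch is that researchers never need to file individual requests for national-level accounts, while sub-national accounts still have a defined path to access.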

These mechanisms for data access — along with clear, feasible benchmarks for what qualifies as high-influence public platform data — provide a roadmap for platforms to implement meaningful transparency tools while also balancing privacy considerations. This empowers the research community to hold government and political power accountable and to shine a light on how platforms influence discourse in our digital age.

As elections become ever more mediated by online platforms, public understanding of how information flows, who shapes it, and how it reaches voters depends on persistent, rigorous, and independent research. The "Better Access Framework" defines what constitutes influential public data and requires it to be made available, offering a practical path forward for researchers and platforms. Implementing these standards would strengthen transparency and accountability, ensuring that digital spaces can be studied, understood, and improved for the public good.

Authors

Josephine Lukito
Dr. Josephine ("Jo") Lukito is an Assistant Professor at the University of Texas at Austin’s School of Journalism and Media and incoming Professor of Digital Communication in the Digital Democracy Centre at the University of Southern Denmark. She is also the Director of the Media & Democracy Data Co...
Kaitlyn Dowling
Kaitlyn Dowling is a senior researcher at the Algorithmic Transparency Institute, a project of the National Conference on Citizenship. She leads research on public data access, content analysis, and online civic discourse, working with academic, policy, and civil society partners to improve transpar...
