Researchers release massive Twitter dataset of voter fraud claims

Justin Hendrix / Jan 22, 2021

"The voter fraud allegations to discredit the U.S. 2020 presidential elections are likely to form one of the most consequential misinformation campaigns in modern history," assert researchers at the Social Technologies Lab at Cornell Tech and The Technion in a paper accompanying the release of VoterFraud2020, a data set of election fraud claims on Twitter between October 23rd and December 16th last year.

Indeed, following the violent events of January 6th, sparked by false claims that the 2020 election was "stolen," a substantial number of American adults do not believe the outcome was legitimate. To help others better understand how false claims were propagated, the researchers packaged a data set that includes 7.6 million tweets and 25.6 million retweets, gathered by manual and data-driven methods focused on keywords such as "voter fraud" and hashtags like #stopthesteal.
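For readers who want a feel for what keyword-focused collection can look like, here is a minimal sketch in Python. It is not the researchers' pipeline: the two-item term list, the function names `build_pattern` and `matches`, and the matching rules (case-insensitive, flexible whitespace) are all assumptions made for illustration.

```python
import re

# Hypothetical, illustrative term list; the actual dataset used a larger
# set of keywords and hashtags chosen by the researchers.
TERMS = ["voter fraud", "#stopthesteal"]


def build_pattern(terms):
    """Compile one case-insensitive pattern matching any of the terms.

    Multi-word terms tolerate any run of whitespace between words.
    """
    parts = [r"\s+".join(re.escape(word) for word in term.split())
             for term in terms]
    return re.compile("|".join(parts), re.IGNORECASE)


def matches(text, pattern):
    """Return True if the text contains at least one tracked term."""
    return pattern.search(text) is not None


pattern = build_pattern(TERMS)
sample_tweets = [
    "Massive VOTER  FRAUD in my county",
    "Rally tomorrow #StopTheSteal",
    "I voted early this year",
]
# Keep only the tweets that mention a tracked term.
kept = [t for t in sample_tweets if matches(t, pattern)]
```

Simple substring matching of this kind catches promoters and detractors alike, which is consistent with the data set containing both groups.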

The accompanying analysis also raises key questions about the role of social media in spreading false claims. In its paper, the Social Technologies Lab notes that its findings do not necessarily align with claims in a widely cited October 2020 study by researchers including Yochai Benkler at Harvard's Berkman Klein Center, which concluded that the voter fraud disinformation campaign in 2020 appeared to be "an elite-driven, mass-media led process." Instead, the Social Technologies Lab notes "external links referenced by promoters of the claims point mostly to low-quality news websites, streaming services, and YouTube videos. Some of the widespread videos claiming ‘evidence’ of voter fraud were published by surprisingly small channels."

The research also finds that key clusters of users drove a great deal of the volume of voter fraud claims. Individuals like Lin Wood and Sidney Powell, for instance, were rivaled only by former President Donald Trump in terms of retweets during the period. While the data set does not represent the full volume of activity on Twitter, in the representative sample Trump's tweets were retweeted more than 1.5 million times, Wood's more than 1 million times, and Powell's more than 600,000 times.

Visualizations accompanying the findings depict the interaction between accounts promoting voter fraud claims and detractors countering those claims.

Real-time information warfare: "Five communities in the retweet graph of people posting about voter-fraud claims; the blue cluster on the left side includes mainly detractors of voter-fraud claims."

The researchers also retained full data from 99,884 subsequently suspended users, part of a Twitter takedown of offending accounts, many from the QAnon community. A related visualization highlights the suspended sub-community, mostly of QAnon supporters. Kate Starbird, a researcher at the University of Washington who helped lead the Election Integrity Partnership, a coalition of researchers concerned with disinformation around the 2020 election, confirmed the findings were similar to her own.

The top domains referenced by promoters of voter fraud claims include publications such as Gateway Pundit, Fox News, Epoch Times, The Federalist, and lesser-known websites such as DavidHarrisJr.com and TheDCPatriot.com, along with video platforms including Periscope and YouTube. The top domains referenced by detractors include The Washington Post, CNN, RawStory, The Independent and the New York Times.

Indeed, while the paper is based on data from Twitter, an analysis of YouTube URLs permits the authors to consider the role YouTube played in the broader disinformation campaign. Some 12% of all tweeted URLs in the dataset direct to content on YouTube. "Despite YouTube’s announcement that it will take actions against content creators who falsely claim the existence of widespread voter fraud, as of Jan 11th, the top 10 channels and videos listed in our tables are still available on YouTube," the authors note.
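A figure like the share of tweeted URLs pointing to YouTube can be computed by tallying the host of each link. The Python sketch below shows one way to do that. It is an assumption-laden illustration rather than the authors' method: the function names, the `YOUTUBE_HOSTS` set, and the rule of stripping "www." and "m." prefixes are all hypothetical choices, and the sample URLs are made up.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed set of hosts treated as YouTube; the paper may define this differently.
YOUTUBE_HOSTS = {"youtube.com", "youtu.be"}


def normalize_host(url):
    """Return the lowercased host of a URL without a leading 'www.' or 'm.'."""
    host = (urlparse(url).hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    return host


def domain_counts(urls):
    """Count how many URLs point to each normalized host.

    URLs with no parseable host are counted under the empty string.
    """
    return Counter(normalize_host(u) for u in urls)


def youtube_share(urls):
    """Fraction of URLs whose host is in YOUTUBE_HOSTS (0.0 for an empty list)."""
    counts = domain_counts(urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[h] for h in YOUTUBE_HOSTS) / total


# Made-up sample: two of the four links go to YouTube.
sample_urls = [
    "https://www.youtube.com/watch?v=abc",
    "https://youtu.be/xyz",
    "https://example.com/story",
    "https://example.org/post",
]
share = youtube_share(sample_urls)  # 0.5 for this sample
top_domains = domain_counts(sample_urls).most_common(3)
```

The same `domain_counts` tally, sorted by frequency, is the kind of summary that produces a "top domains" list.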

Visit the full dataset here.

(Disclosure: Justin Hendrix collaborates with one of the VoterFraud2020 researchers, Mor Naaman, on a course jointly taught at Cornell Tech and NYU.)


Justin Hendrix
Justin Hendrix is CEO and Editor of Tech Policy Press, a new nonprofit media venture concerned with the intersection of technology and democracy. Previously, he was Executive Director of NYC Media Lab. He spent over a decade at The Economist in roles including Vice President, Business Development & ...