Perspective

Why Platform Data is Essential to Public Health Efforts in Tobacco Control

George Pearson / Nov 10, 2025

This piece is part of “Seeing the Digital Sphere: The Case for Public Platform Data” in collaboration with the Knight-Georgetown Institute. Read more about the series here.

Public platforms have become a vital part of public health, as researchers seek to study the effects of the platforms themselves as well as better monitor and understand other systemic determinants of health. Tobacco, the leading cause of preventable death in the United States, is one topic where platform data is becoming increasingly crucial, but where data access and use still present significant challenges.

In the United States, 1.97 million high school students (10%) report using a tobacco or nicotine product. As young people have moved to online platforms, the tobacco industry has moved with them. Spending on social media advertising by the top five e-cigarette companies doubled between 2018 and 2020, and tobacco brands maintain a large presence on social media.

Platform data and the evolving landscape of tobacco control

The tobacco industry uses online platforms not only to promote its products to new and existing customers, but also to improve its public image and oppose regulation. Research has found that 24 of the 58 most retweeted Twitter accounts posting about tobacco policy had ties to the tobacco industry.

In recent years, researchers across the globe have produced a wealth of tobacco control research utilizing platform data. Truth Initiative, the United States’ largest nonprofit public health organization dedicated to preventing youth and young adult nicotine addiction, has been part of this effort. Work produced by myself and my colleagues has:

This work has had a significant impact, informing congressional reports, state lawsuits against vape manufacturers and distributors, and data sheets from the US Department of Health and Human Services.

Much of the existing research utilizing platform data has focused on monitoring the marketing tactics of the tobacco industry. For example, the Truth Initiative’s analysis of 15 brand-owned Instagram accounts found a range of youth-appealing themes, with only 13% of posts including compliant warning labels. In other work, we used platform data to show how the e-cigarette brand Geek Bar has targeted EDM music festivals as a form of experiential marketing.

Platform data can be used in other ways. Social listening enables researchers to better understand public health issues, from individuals’ experiences quitting vaping to public sentiment surrounding new tobacco control policies. Platform data also helps public health groups to assess the effectiveness of digital campaigns promoting cessation and prevention messaging.

Data barriers in public health research

Restricted access to and limitations in available data continue to hinder public health research. Work by my colleagues using Reddit data to better understand people’s experiences of quitting vaping, or to document tactics used to circumvent product bans, has become untenable since the closure of the Pushshift API. Similarly, work exploring how vaping advocates disseminated misinformation during the COVID-19 pandemic, or organized campaigns against bans on flavored vapes, would not have been possible without the Twitter Research API.

Even when platforms open up data access, their tools can be inaccurate. An audit of the TikTok Research API that I conducted with colleagues at the Truth Initiative and an additional collaborator found major discrepancies in metadata: the vast majority of videos returned by the API had over 90% of views missing compared to what was viewable to users on the platform. In one case, the API underreported views by more than 5 million.
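The comparison behind an audit like this can be illustrated with a minimal sketch. It is not the method used in the audit described above: the record fields, the 90% threshold parameter, and the sample numbers are all hypothetical, chosen only to show how API-reported view counts might be compared against counts visible on the platform.

```python
from dataclasses import dataclass


@dataclass
class ViewRecord:
    video_id: str
    api_views: int       # view count returned by a research API (hypothetical field)
    observed_views: int  # view count visible to users on the platform (hypothetical field)


def missing_share(record: ViewRecord) -> float:
    """Fraction of platform-visible views that are absent from the API count."""
    if record.observed_views <= 0:
        return 0.0
    gap = max(record.observed_views - record.api_views, 0)
    return gap / record.observed_views


def audit(records: list[ViewRecord], threshold: float = 0.9) -> dict:
    """Summarize how many videos have more than `threshold` of their views missing."""
    flagged = [r for r in records if missing_share(r) > threshold]
    return {
        "n_videos": len(records),
        "n_flagged": len(flagged),
        "share_flagged": len(flagged) / len(records) if records else 0.0,
        "max_undercount": max(
            (r.observed_views - r.api_views for r in records), default=0
        ),
    }


# Made-up illustrative data, not figures from the audit.
records = [
    ViewRecord("a", 50_000, 5_200_000),
    ViewRecord("b", 900, 10_000),
    ViewRecord("c", 4_800, 5_000),
]
summary = audit(records)
print(summary)
```

Reporting both the share of flagged videos and the largest absolute undercount mirrors the two kinds of figures quoted above: a proportion across the sample and a worst single case.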

The unique nature of public health research can also raise difficult decisions and novel challenges when using public platform data.

Measuring exposure is one such challenge. While youth consistently report being exposed to tobacco content online, measuring what content is viewed by different groups is difficult because their feeds are generated by opaque, proprietary algorithms.

Understanding how these algorithms impact exposure to content is especially important for tobacco control. The tobacco industry has a long history of tailoring its messaging to specific demographic groups, from targeting African Americans with menthol cigarettes to crafting unique campaigns for the LGBTQ+ community.

As such, any attempt to understand demographic differences in exposure to tobacco content requires understanding the algorithms governing online platforms.

Such research is possible. In a recent study, my colleagues and I found that while watching anti-tobacco videos on YouTube, 13% of the videos the algorithm recommended in the sidebar led to pro-tobacco content. Other work using puppet accounts has shown that user demographics appeared to impact the videos returned by the YouTube search engine. However, further work and new methods, such as data donation, are needed to advance this area of study.

Ethical and regulatory challenges in platform research for public health

Tobacco control research also faces unique ethical and methodological challenges. For example, tobacco control research must deal with the international nature of public platforms. Although tobacco marketing in the United States is governed by regulations created by the Food and Drug Administration (FDA), a user in the United States can still view content produced by a tobacco brand account nominally producing content for other regions. Internal emails from Juul, an e-cigarette brand closely associated with the youth vaping epidemic, show that brands are often aware that accounts designed for one country can have an international following. As such, projects monitoring brand accounts have to assess the relevance of internationally focused accounts, balancing the desire to be comprehensive against concerns over collecting unnecessary data.

Data retention poses another ethical challenge. Three months after the aforementioned article on Geek Bar’s promotion of music festivals was published, Geek Bar’s account disappeared from Instagram, removing all content we had analyzed. Similarly, Juul removed its Facebook and Instagram accounts after growing scrutiny of its social media marketing.

Many of the social media industry’s proposed solutions to allow public interest access to data involve removing content from datasets as soon as it’s deleted from the platform. This is entirely understandable from the perspective of protecting individual user privacy. However, the same account types used by private users are also used by multinational companies with significant impacts on public health, so these companies gain the same right to have content erased. Erasure renders further analysis impossible and can hinder regulatory action.

The decision by the public platforms to afford businesses the same data rights as private users can also cause relevant accounts to be missing entirely from research tools. While pilot-testing the Meta Content Library in early 2024 (we have not tested the library since, and cannot speak to subsequent changes), we found that if an account, in its settings, required viewers to have a self-reported age of 18 or older, it was entirely absent from the library. As a result, of the 40 brand-owned accounts examined, only 12 were available through the library.
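A coverage check of this kind can be sketched in a few lines. This is an illustration only: the account handles are hypothetical placeholders, and the function does not call the Meta Content Library or any real API. It simply compares a list of monitored accounts against the accounts a research tool returned.

```python
def coverage_report(monitored: list[str], returned: list[str]) -> dict:
    """Compare accounts a project monitors with those a research tool returns."""
    monitored_set = set(monitored)
    found = monitored_set & set(returned)
    missing = sorted(monitored_set - found)
    rate = len(found) / len(monitored_set) if monitored_set else 0.0
    return {"found": len(found), "missing": missing, "coverage": rate}


# Hypothetical handles: 40 monitored accounts, 12 of which the tool returns.
monitored = [f"brand_{i:02d}" for i in range(40)]
returned = monitored[:12] + ["unrelated_account"]

report = coverage_report(monitored, returned)
print(report["found"], len(report["missing"]), report["coverage"])
```

Keeping the list of missing accounts, rather than only the coverage rate, makes it possible to check whether the absent accounts share a setting such as an age restriction.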

Of course, only a minority of pro-tobacco content posted online comes from brand-owned accounts. An important area of tobacco control research involves surveilling organic, influencer, and affiliate content, a task made more difficult by the policies of online platforms. For example, Instagram’s branded content tool is designed for influencers receiving compensation to declare financial connections. However, tobacco products (including vapes) are prohibited from using the feature. The result has been that most e-cigarette influencers simply don’t declare financial connections. Such disclosures are required by the Federal Trade Commission (FTC); however, work by researchers at Truth Initiative found that among 262 vape-related influencer Instagram posts, only one complied with FTC disclosure guidelines.

Affiliate programs, in which users share product links and earn commissions on sales, are also widespread. E-cigarette brands, including Juul and Juicehead, have run affiliate programs, as have online retailers. Such programs often blur lines between “organic” fan-created content and paid advertising. Additionally, content produced by affiliates is unlikely to contain warning labels and may target vulnerable groups.

Differentiating between genuine organic content and content produced by those financially compensated by the tobacco industry has become an area of growing interest, especially in light of the recent rise of the oral nicotine pouch Zyn. A wave of young adult content creators (self-described as “zynfluencers”) has posted humorous Zyn videos to online platforms and garnered hundreds of millions of views. Philip Morris International, the manufacturer of Zyn, says it does not partner with social media influencers, suggesting viral Zyn content is part of a passionate subculture. However, an analysis of the top 100 Zyn videos on TikTok found that 9 of the content creators had links to retailers selling Zyn, suggesting a financial incentive, albeit one not tied directly to the manufacturer.

Reframing platform data access as a public health issue

Public health is a perspective often missing from discussions of platform data, which tend to focus on issues such as extremism or political misinformation. Yet, the tobacco industry continues to use public platforms to promote products and influence policy. This, combined with platform use by other industries, such as alcohol and gambling, and the documented effects of platform use on mental health, underscores that large public platforms are an important determinant of health. Public health must be considered in discussions about platform data.

Work continues to identify platform features and content that are harmful to public health, while also leveraging these platforms to improve public health via more effective campaigns and behavior insights. For this work to continue, access to transparent, accurate platform data is vital.

Authors

George Pearson
George Pearson is a Senior Research Manager at the Schroeder Institute, the research wing of the Truth Initiative, a public health non-profit based in Washington DC focused on preventing nicotine addiction. His research focusses on the role of big tech and digital platforms in the marketing and sale...
