Lessons Learned from Red Teaming a Newsbot
Elise Silva, Madeline Franz, Sodi Kroehler / Jun 13, 2025While AI has been in news headlines plenty, AI is also increasingly in the news—generating content, suggesting edits, and aiding in information discovery. For example, The New York Times recently started allowing employees to use AI for some tasks, such as generating SEO headlines and summaries.
An under-discussed application of AI in the news, however, is the increasing integration of AI into news organizations’ websites as user-facing products that can perform a variety of tasks, including interfacing with users searching for information, summarizing news articles, or answering questions in real-time. And this isn’t just happening on news websites, but on a host of other government and public-facing web pages too, where individuals are interfacing with bots as information finding tools.
With AI overviews on web searches that leverage generative AI technology to essentially skim off quality reporting, synthesizing it for readers without redirecting web traffic to news sites, The Wall Street Journal reports that newsrooms are forced, yet again, to rethink their business models. As Sarah Grevy Gotfredsen wrote last year regarding Time’s new AI chatbot, “when newsrooms build their own chatbots, it at least gives them some control back over how their content is attributed and cited.”
But AI technology is not a neutral technology—it is charged with social biases and deployed within socio-technical contexts. As such, news chatbots are not neutral tools; they actively shape how users experience, interpret, and respond to the news, perhaps even influencing trust in the news organization itself.
In this sense, even a cautious approach to deploying a news bot will inherently bring overlooked risks and unintended consequences because, no matter how careful, the newness of the information interaction experience introduces unfamiliar dynamics of authority, expectation, and interpretation that are difficult to anticipate, and even with the most tightly constrained chatbot, impossible to fully control. Ultimately, readers will respond in ways no organization (or system) can fully predict.
The challenge of balancing safety with effectiveness
In 2024, the University of Pittsburgh Institute for Law, Policy, and Security worked with a local independent news organization, Spotlight PA, to red-team its experimental Election Assistant. That is, we were asked to independently check the AI system for potentially dangerous behaviors it might exhibit when interacting with users regarding their election-related questions. Spotlight’s approach was cautious and thoughtful, and we found its tool to be relatively safe as the team built the election assistant around predefined question-and-answer pairs rather than a system that used more generative properties of unconstrained chatbots. Through red-teaming before launch and analyzing over 3,000 queries and responses leading up to the 2024 elections, we identified several key lessons—highest among them, the challenge of balancing safety with effectiveness. We learned that the more constrained an AI-powered interface is, the less likely it is to provide incorrect information; however, it is also less likely to be able to answer a wide array of user queries.
Right now, generative AI technology may not be sophisticated enough to strike the right balance between precision and flexibility. To ensure a generative AI-powered news tool is safe, especially for sensitive and high-stakes topics like election information in a swing state like Pennsylvania, developers often have to restrict what it can do. But the more narrowly the system is focused, the less helpful it becomes across the wide range of questions users might ask. Generative AI systems are built to be open-ended and adaptive; the more constraints added to reduce the risk of harm, the more the very capabilities that make the technology appealing in the first place are limited. AI companies deploying frontier models are aware of this tradeoff, framing it as a helpfulness vs. harmfulness tradeoff.
Newsrooms will have to face this conundrum and associated tradeoffs as they decide how and when to deploy these technologies. Do they want a highly adaptive tool that can carry on dynamic conversations with users, knowing that flexibility introduces risk? Or would a more controlled, limited system—like one built around predefined question-and-answer pairs (which was Spotlight PA’s approach)—better serve their needs? There may be no perfect answer yet, but asking these questions early and transparently can help newsrooms as they weigh their options, especially given the broader background of already declining trust in mainstream media, where any factor that could further erode that trust must be considered carefully.
Mitigating AI news interfaces’ risks: policies and best practices
There are many reasons news organizations are incorporating generative AI into their web experiences, both financial and social. Online search experiences are fractured, with people finding their information through their smartphones and other digital means rather than traditional news media. News organizations must pivot to stay relevant, or even, increasingly, just to stay in business, and AI offers the promise of doing things faster, smarter, and cheaper. Notably, The Washington Post launched an “Ask the Post AI” chatbot the week after the US Presidential Election, complete with an enticing interface featuring a carousel of sample questions across the screen that users could consult to inspire a query. The trend of integrating chatbots as information discovery tools in news interfaces extends to a variety of other websites like Forbes’ Adelaide, the Financial Times’ Ask FT, and Dallas City News’ chatbot.
Given this rise in generative AI-powered tools—and our own learning with Spotlight PA—we offer the following considerations for developers, news organizations, or other public organizations working to integrate this technology responsibly:
Proactively educate users on question framing
Users benefit when they know what a chatbot is for and what it isn’t for. Clear, upfront guidance about the kinds of questions a tool can answer, its limitations, and the sources it draws from (such as the news site’s own reporting and archives or a larger web index) helps orient users and manage expectations. There are various ways to approach this, such as onboarding prompts, example queries, or contextual information within the interface itself. However, it is achieved, the point is this: by helping users frame more effective questions and understand the bot’s parameters, news organizations can help direct a user’s experience, which also helps manage their assumptions about the tool’s constraints and affordances.
Word non-answers carefully
When a bot declines to answer, how it responds matters. Users may interpret abrupt or vague refusals, or even redirected questions, as evasive. Or worse, as a result of some kind of editorial bias. A thoughtfully worded non-answer can mitigate this risk, which, at its core, is a risk of losing user confidence and attention. Instead of simply saying, “I can’t answer that,” a bot might explain why. For instance, “This question falls outside the scope of this tool, which is designed to provide information based on X, Y, and Z,” or something like “More context is needed to give a helpful answer—try rephrasing or narrowing your question in X, Y, or Z way.” Even when a bot is programmed to not respond, transparency about why is helpful context for users. In sum, even a non-answer is an answer. It conveys a value of sorts. And users may read into the non-answers in ways that erode trust in the bot, and by extension, the news organization itself.
Pay attention to translations
Offering live or automated translations can increase access and equity, but only if they are accurate. Even a small translation error could unintentionally misinform an entire group of users. This makes translation an especially important area to introduce human oversight, as it is not just a technical concern, but a trust and equity issue. We found this to be especially laborious and to require a great deal of human time in our work with Spotlight.
Red team with trust in mind
There are many ways to approach red teaming. One approach we found particularly helpful was to check for answers that may affect trust, even if they’re technically or factually correct. A response that feels evasive or vague or even oddly framed could leave users with the impression that something is being hidden or manipulated. Suggestions for follow-up questions that introduce new ideas that do not align with user inquiries might inadvertently confuse users and make them wary about what should have been a straightforward interaction. As you red team, simulate skeptical users; don’t just fact-check responses. Ensuring that you test how the bot handles sensitive topics, ambiguous questions, or emotionally charged queries is crucial. Consider how the system redirects, contextualizes, and clarifies as it interacts with users.
Conclusion
Before we partnered with Spotlight PA, we thought the risks of AI use in the newsroom were primarily hallucinations that lead to false information, or biased information due to training data limitations, both of which could severely harm the reputation of news outlets. Given what we found in our analysis, we would add emerging risks tied to shifting user behaviors, attitudes, and dispositions, as AI technologies increasingly mediate how people experience and emotionally interact with the news.
These risks are also not limited to AI assistants or bots used by news organizations. They extend to other industry-specific chatbots, and there have been documented problematic bot interactions in health and wellness, local government, and even banking. When a bot is created for a certain industry, even with safeguards, it can often be jailbroken if a user pushes the bot’s limits to generate harmful content, either on purpose or by accident. One of the most effective ways to reduce this risk is to build an AI-powered bot with strong restrictions, as we saw with Spotlight’s Election Assistant. However, when bots are designed to support a broader range of use cases and user queries, it becomes extremely difficult to prevent harmful outputs entirely, especially given the unpredictability of user behavior.
This underlines the need for continued research into user behaviors and changing information attitudes of news seekers. We would hypothesize that users who go to a newsbot for information prioritize both the ease and efficacy of a Q&A interface and the accuracy of vetted reporting. But we can’t know exactly what user intent is when typing in questions. This makes traditional auditing methods very difficult. Understanding user intent and a user base for a tool is crucial in making these emerging technologies and interfaces both trustworthy and effective.
The authors would like to acknowledge and thank their partners at Spotlight PA and the wider research team, including Beth Schwanke, Yu-Ru Lin, Lara Putnam, Rr. Nefriana, and Ahmad Diab.
Authors


