Perspective

What the AI Safety Debate Can Learn from the Techlash

Antonina Vykhrest / Nov 6, 2025

Sam Altman interviews Mark Zuckerberg in 2016.

What if we could accelerate artificial intelligence safety and risk management by building on platform governance and online safety foundations? Could we compress decades of product and algorithmic safety lessons into the shorter timeframe that advanced AI diffusion demands, and in doing so prevent more harm?

As we watch the cycle of coverage around AI safety unfold, from congressional hearings digging into Meta's policy that allowed chatbots to engage children in romantic or sexualized conversations to lawsuits from parents of teens who died by suicide after interacting with AI companions, I recognize the pattern. This trend of public outcry at seemingly foreseeable tech safety lapses echoes the 2018 techlash, when tech companies grappled with a series of broad societal risks. The difference is that today, we are dealing with higher stakes and with less time to course correct.

After fifteen years bridging trust and safety and international human rights, from serving on TikTok's digital safety and platform governance teams to the Council of Europe's international policy and cooperation halls, I see an opportunity. Many emerging AI safety challenges closely parallel problems the trust and safety community spent two decades learning to address. Yet the conversations between the communities often happen in parallel rather than in productive dialogue that shares and applies lessons learned.

The rapid diffusion of advanced AI means we are compressing decades of technological adoption into months. We don't have time to repeat the longer learning curve of the internet and social media governance, or any prior technological revolution that took decades for mass adoption.

This isn't about claiming all the answers. Frontier AI models present novel technical challenges around alignment, interpretability and autonomous systems. But the evolving nature of digital and AI safety means we can’t start from scratch — we must build on the moments when prior fields achieved genuine “aha” breakthroughs in sociotechnical safety.

Trust and safety has grappled for decades with the safety of online communities and marketplaces and with operationalizing online policies drawn from sociolegal norms rooted in human rights law. Product safety, algorithmic fairness, data privacy and even copyright and fraud are all issues the field has worked through, ultimately arriving at shared frameworks, methods and learnings.

Five critical lessons emerged: treating safety as a sociotechnical issue requiring diverse, integrated expertise on safety teams; embedding safety by design from inception; prioritizing transparency to maintain trust and encourage industry sharing; preparing robust systems for inevitable unintended consequences; and anchoring affected communities in solution development through continuous engagement with domain experts, policymakers, civil society and academia.

These are the greatest levers we have to meaningfully accelerate AI safety infrastructure and the processes and organizational capacity required to do so.

Understanding the techlash pattern

The techlash, Oxford Dictionaries' runner-up word of 2018, marked a shift in how society viewed the platforms mediating our social and political lives. In the early 2010s, social media and the internet were still celebrated, including during the Arab Spring, as a force for democratization. By the mid-to-late 2010s, crises cascaded: Cambridge Analytica's abuse of user data, Russian election interference, incitement to violence in Myanmar and Kenya and mounting child safety concerns.

Three patterns emerged that matter for today. First, initial crisis-driven policymaking produced reactionary solutions, including laws that sought to regulate illegal online speech but ended up as excessively blunt tools, like Germany’s NetzDG. Second, expertise gaps created misaligned interventions, with platforms maximizing for engagement alone without input from experts on how algorithmic feeds can fuel radicalization or how certain features could negatively affect teens. Third, it took years for tech companies to collaborate on safety: despite years of ISIS content circulating online, there were no formal mechanisms to share threat intelligence until the industry established the Global Internet Forum to Counter Terrorism (GIFCT) to identify terrorist content through hash-sharing databases.
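
To make the hash-sharing mechanism concrete, below is a minimal Python sketch of how a platform might check an upload against a shared fingerprint database. The shared_hash_db set, the matches_shared_db function and the use of SHA-256 are illustrative assumptions; in practice, GIFCT members exchange perceptual hashes (such as PDQ for images) that also catch near-duplicates, not exact-match digests.

```python
import hashlib

# Illustrative shared database of fingerprints of known violating content.
# Assumption: a plain SHA-256 digest stands in for the perceptual hashes
# that coalition members actually exchange to catch near-duplicates.
shared_hash_db = {
    # SHA-256 of the placeholder payload b"test"
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def matches_shared_db(content: bytes) -> bool:
    """Return True if this content's fingerprint was already reported by another platform."""
    return hashlib.sha256(content).hexdigest() in shared_hash_db

if __name__ == "__main__":
    upload = b"test"  # placeholder for an uploaded file's bytes
    if matches_shared_db(upload):
        print("Match: route to human review and removal queue.")
    else:
        print("No match: proceed with normal moderation checks.")
```

The design insight that made the coalition workable is that platforms share compact fingerprints rather than the content itself, so threat intelligence can circulate without redistributing the underlying material.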

Why AI demands accelerated re-learning

Our current AI moment differs in ways that amplify urgency. ChatGPT reached 100 million users in two months, an unprecedented diffusion speed that compressed years of adoption into months.

The nature of potential harm is qualitatively different. Social media amplified user-generated content and created new risk vectors such as child grooming, harassment, misinformation and fraud. Frontier AI systems actively generate, decide and act. Agentic AI compounds these risks by pursuing goals autonomously, opening up potential loss-of-control scenarios. Catastrophic risk spans LLM-enabled cyber attacks, potential assistance with weapons development and critical infrastructure disruption. Algorithmic decision-making can encode bias at scale in sensitive domains such as hiring, healthcare and criminal justice, while the limits of model interpretability create accountability gaps.

Current harms such as algorithmic discrimination, facial recognition errors, chatbot-linked tragedies, AI-enabled fraud and model jailbreaks likely represent only the earliest manifestations. The most devastating consequences are usually the ones nobody anticipated.

There is also the added complexity of the split between AI developers and deployers: frontier labs develop general-purpose models, while downstream deployers build industry-specific use cases and products.

An opportunity to connect adjacent communities

AI safety encompasses a sprawling ecosystem: frontier AI labs, developers and deployers, policymakers, academia, enterprise risk managers and national security strategists, working on everything from technical alignment to export controls. This breadth has created fragmentation, where lessons from adjacent fields get overlooked. Fragmentation has also been deepened by opposing AI safety camps described as AI doomers and AI ethicists.

Encouragingly, some voices are starting to draw attention to the gaps between trust and safety and AI safety, highlighting how AI challenges parallel product safety and social media lessons. It’s critical that these connections are amplified and systematically incorporated into AI safety and governance field-building.

Five foundational lessons from platform safety

1. Safety is sociotechnical

Platform safety cannot be solved by technologists alone. Effective programs require domain experts such as governance specialists, ethicists, sociologists and civil society representatives working alongside technical teams, integrated into decision-making rather than merely consulted.

Early content moderation teams built systems without always taking cultural context into account, resulting in both over-enforcement and under-enforcement. Programs evolved, especially at large global platforms, into cross-functional teams where regional specialists and domain experts shaped more culturally informed and evidence-based policies and interventions.

Organizations like the National Center for Missing and Exploited Children (NCMEC) brought together tech companies, law enforcement and advocates to develop industry-wide standards to fight child sexual abuse. The EU's Digital Services Act consultation convened diverse stakeholders.

For AI, similar integrated structures are needed, where researchers work with domain experts throughout development and not just at the red-teaming stage: child development experts when designing AI for minors, mental health professionals on teams building companions, democratic governance experts shaping systems that influence discourse. The goal is organizational structures enabling meaningful collaboration, with shared language and co-created solutions.

Frontier labs already involve domain experts in red teaming, but these experts also need to be integrated into internal teams. Anthropic and OpenAI currently do this within their safety teams; it’s critical that every company developing or deploying AI meaningfully builds sociotechnical expertise into its own teams.

2. Safety by design over retrofitting

Retrofitting safety after deployment is exponentially costlier and less effective than building it in from the start. This shift from reactive to proactive represents one of the field's most important evolutions.

Concrete practices include conducting risk assessments during ideation to define what could go wrong, who is vulnerable and what the worst-case scenarios are; writing safety requirements into specifications, such as functional requirements (the system must do X), performance requirements (do X within Y seconds) and systems safety requirements (the system must escalate for user approval); and carrying out impact and algorithmic assessments before and after launch.
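
As an illustration of what safety requirements in a specification can look like, here is a hypothetical sketch in Python; the SafetyRequirement structure, the requirement IDs, the descriptions and the verification methods are invented for this example rather than drawn from any company's actual specification.

```python
from dataclasses import dataclass
from enum import Enum

class RequirementType(Enum):
    FUNCTIONAL = "functional"          # the system must do X
    PERFORMANCE = "performance"        # do X within Y seconds
    SYSTEMS_SAFETY = "systems_safety"  # e.g., must escalate for user approval

@dataclass
class SafetyRequirement:
    req_id: str
    req_type: RequirementType
    description: str
    verification: str  # how the requirement is tested before launch

# Hypothetical requirements for an AI companion product (illustrative only)
requirements = [
    SafetyRequirement(
        "SR-001", RequirementType.FUNCTIONAL,
        "Detect self-harm disclosures and surface crisis resources",
        "Red-team suite with clinician-reviewed test prompts",
    ),
    SafetyRequirement(
        "SR-002", RequirementType.PERFORMANCE,
        "Route flagged conversations to human review within 60 seconds",
        "Load test against the escalation queue",
    ),
    SafetyRequirement(
        "SR-003", RequirementType.SYSTEMS_SAFETY,
        "Require explicit user approval before any agentic action with real-world effects",
        "Pre-launch audit of the tool-use approval flow",
    ),
]
```

Writing safety requirements in this testable form lets them gate a launch the same way functional and performance requirements already do, instead of living only in a policy document.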

The contrast between applying risk management and maintaining minimum compliance matters. Minimum compliance asks: “What's the least we can do to avoid liability?” Risk management asks: “What's the most we can do to prevent harm while achieving objectives and what are the redlines?” Given AI's unprecedented reach, only the latter suffices. This is a lesson that Character AI’s teams should have learned several iterations ago. Safety has to be built in from the earliest stages. It’s not an add-on, but a foundational architecture.

3. Transparency builds trust

Companies have essentially one chance to build public trust around new technologies. Once broken, it’s exponentially harder to rebuild. Minimizing disclosure of problems to avoid liability inevitably backfires. The question isn’t whether failures become public, but whether companies disclose proactively (building trust) or get exposed through leaks (destroying it). Meta’s leaked chatbot policy on romantic conversations with minors illustrates this failure.

Meaningful transparency means publishing detailed safety testing results, including failures: which diverse experts were involved, what critical issues were found and what wasn’t fully resolved. Other transparency practices include disclosing mitigations, educating users about the risks, regular transparency reporting, incident disclosures and third-party evaluations.

This doesn’t mean reckless openness about model vulnerabilities, but measured transparency that builds accountability and trust. Labs like Anthropic lead through model cards, a transparency hub and collaboration with third-party evaluators like METR. A lot more is needed to advance the state of frontier AI transparency, evaluation and assurance, but it is critical that companies lead with these practices.

4. Prepare for inevitable unintended consequences

Platform governance taught us that broader harms can emerge from any technology at sufficient scale. The question isn't whether AI causes unintended harm, but how quickly we detect and respond.

This demands monitoring systems that detect novel harm patterns and track ever-evolving harms and bad actors; holistic incident response that combines technical fixes, policy updates, communication and user support; and sharing lessons across the ecosystem and with industry, including through dedicated coalitions, when novel patterns are discovered.
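
As a rough illustration of what holistic incident response can look like when recorded systematically, here is a hypothetical sketch in Python; the HarmIncident fields and the example entry are assumptions for illustration, not any organization's actual incident schema.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class HarmIncident:
    """Illustrative incident record covering the response dimensions described above."""
    incident_id: str
    detected_on: date
    harm_pattern: str           # e.g., a novel jailbreak or fraud technique
    technical_fix: str          # model, filter or product change shipped
    policy_update: str          # how usage or content policy changed
    user_communication: str     # disclosure and support offered to affected users
    shared_with_industry: bool  # whether lessons were shared with coalitions
    lessons_learned: list[str] = field(default_factory=list)

# Hypothetical example entry
incident = HarmIncident(
    incident_id="INC-2025-014",
    detected_on=date(2025, 3, 2),
    harm_pattern="Prompt pattern eliciting step-by-step financial fraud assistance",
    technical_fix="Updated refusal classifier; added the pattern to the evaluation set",
    policy_update="Clarified the prohibited-use policy on fraud facilitation",
    user_communication="Notified enterprise customers and published an incident summary",
    shared_with_industry=True,
    lessons_learned=["The pattern was first reported by an external researcher"],
)
```

Capturing the policy, communication and industry-sharing dimensions alongside the technical fix is what turns an incident log into a learning mechanism rather than a bug tracker.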

Perfection is impossible, but preparedness is achievable. We can build systems that detect, respond to and learn from harms as they emerge.

5. Anchor affected communities: participatory approaches are essential

The most effective safety measures emerge from including feedback from groups potentially adversely impacted. This is critical for AI systems affecting users in highly individualized ways.

Current AI safety relies heavily on technical expert red-teaming. This is valuable but insufficient, missing linguistic representation and risk categories obvious to people with lived experience of vulnerabilities these systems might exploit.

Participatory design means co-creation workshops where affected communities help design features, policies and safety mechanisms from the earliest stages; community review of new features before deployment; and appropriate compensation, authority and respect for people with lived experience of marginalization, whose expertise is genuinely valuable.

The goal isn't adding participatory steps but fundamentally restructuring how AI systems are designed to center perspectives of those most likely harmed.

Moving forward together

The current moment presents a choice. We can repeat techlash patterns, vacillating between optimism and panic, implementing poorly-designed approaches, allowing preventable harms to accumulate. Or we can build on lessons already learned, which requires doubling down on collaboration.

AI companies should implement safety by design throughout development, invest in safety teams with meaningful and diverse domain expertise and prioritize transparency and third-party evaluations.

Trust and safety professionals should engage with AI safety communities, translating platform governance lessons into AI-specific frameworks and offering experience about building safety at scale.

Regulators should develop and solicit technical and operational expertise before implementing rules, create iterative frameworks evolving with technology and require transparency enabling external accountability.

Civil society and academia can advance evaluation methodologies, assess harms while constructively engaging in solution development, and push for participatory approaches that center affected communities.

We're at a critical juncture. The systems and policies established in coming years will shape not just AI development but our broader relationship with technology.

The techlash taught hard lessons at high societal cost. Stakes are now higher — too high for panic and misguided solutions, but also too high for overlooking preventable harms. The question isn't just about AI's future, but whether we can govern transformative technologies responsibly.

Platform governance expertise exists and is available. What's needed is connection between AI safety researchers and trust and safety practitioners, between technical capabilities and operational excellence, between communities solving adjacent problems in silos.

The lessons are there and if we apply them, we can reduce harms that may otherwise go undetected until it’s too late.

Authors

Antonina Vykhrest
Antonina Vykhrest is a former Director of Trust & Safety and Accessibility at Rockstar Games with over fifteen years of experience in online safety, platform governance, and international human rights. She's currently a Research Fellow with the French Center for AI Safety (CeSIA).
