What’s Going on With Twitter’s Trust & Safety Policy?

John Perrino / Dec 7, 2022
John Perrino is a policy analyst at the Stanford Internet Observatory.
While the “Twitter Files” – the release of internal communications related to the platform’s decision to limit the propagation of a New York Post story about Hunter Biden – have been sucking up most of the oxygen in the Twitter news cycle, there is a growing dispute over whether hate speech and child abuse content is surging or declining since Elon Musk purchased the company.
The status of trust and safety efforts at Twitter has become a “he said, she said” situation, with Musk, his advisors, and even some victims’ advocates claiming unprecedented progress on countering slurs and abuse, while journalists, civil rights advocates, and researchers warn that the content is still easy to find and has increased in frequency.
There are a few key issues at the heart of the dispute:
- Whether trust and safety “success” should be measured by the prevalence of harmful content or the number of people who see it;
- Transparency about whether and how policies are being enforced under Musk’s Twitter 2.0 policy of “freedom of speech, but not freedom of reach”;
- Unilateral decision making under Musk’s ownership and a lack of staff and resources after massive layoffs and cost-cutting measures;
- Musk’s erratic behavior on Twitter.
Under Musk’s ownership, Twitter does still appear to be making a concerted effort to address some categories of harmful content. A lack of transparency and diminished collaboration with civil society likely holds back those efforts and leaves gaps in enforcement.
As University of California, Berkeley Center for Human-Compatible Artificial Intelligence researcher Jonathan Stray points out, both the critics and Musk’s crew can claim they are right about safety efforts depending on the data and measurement of success. More transparency and collaboration could move these efforts in the right direction.
The Twitter 2.0 Trust and Safety Debate
On Friday morning, news reports covered research from advocacy and civil rights groups that found hate speech slurs were more prevalent on the platform. Musk claimed that the data actually shows an overall decrease in hate speech on the platform since his acquisition and said the Twitter safety team will publish weekly reports on the data going forward.
Previously, some child and sexual abuse victim advocates praised Musk for blocking the most popular hashtags used to promote and sell child sexual abuse material (CSAM) and for making it easier to report that content. However, investigations by Forbes technology and cybersecurity reporters found that the reporting process took longer due to a lack of staff and that CSAM could be easily found by searching for terms without hashtags or by running similar searches on the platform.
A Case of Known Unknowns
What is clear is that staffing at the company has been decimated around the world, and as a result enforcement of platform policies is changing. The team responsible for enforcing a ban on child sexual abuse material across the Asia Pacific region, for instance, is reportedly down to just one person.
We also know that Musk is unilaterally making trust and safety decisions, rather than deferring to established policies. “A Twitter whose policies are defined by edict has little need for a trust and safety function dedicated to its principled development,” former trust and safety lead Yoel Roth wrote in a New York Times op-ed about why he resigned.
Twitter’s new head of trust and safety, Ella Irwin, told Reuters on Friday that the platform is relying more on automated content moderation and prioritizing restrictions on the reach of certain content over human review. On Saturday, information security researcher Andrea Stroppa said he has worked with Twitter’s Trust and Safety team since Musk took over to take “more aggressive” action to detect CSAM, with suspensions nearly doubling.
What’s missing is the collaboration Musk once promised with outside experts with different viewpoints and expertise. In an interview with Kara Swisher at the Knight Foundation’s Informed conference last week, Roth, the former Twitter trust and safety lead, hinted that Musk may never have actually been serious about launching a content moderation council, and flatly stated that Twitter is less safe today than it was before the acquisition.
Outside researchers don’t have access to the impressions data Musk shared in a graphic on Twitter to show a decrease in hate speech. Data available to researchers also “does not include information on any actions Twitter has taken to limit the reach of content,” according to CNN reporting.
User empowerment and the idea that “free speech does not mean free reach,” as my colleague Renée DiResta has argued, are powerful principles, but can we take Elon Musk’s word for it that hate speech targeting racial, gender, religious, sexual identity, and vulnerable populations is under control? The company had profound problems with this material before Musk bought it; there is little reason to believe he’s solved this complicated problem, especially after firing or accepting the resignations of most of the people responsible for it.
A Twitter where policy decisions are made via Twitter poll and enforcement changes made by tweet or quiet website updates suggests there will be more, rather than less, uncertainty and bias in Twitter’s approach to trust and safety policies. That is precisely the problem Musk once claimed he would fix.
In the year ahead, Musk’s Twitter will likely need to comply with EU regulations that require researcher access to social media data, internal risk assessments, auditing, and prioritizing trusted civil society groups that flag harmful content on the site. Musk said Twitter will work with European regulators, but given his pattern of shrugging off regulators, only time will tell. The clock is ticking.