Broken Code: A Conversation with Jeff Horwitz
Justin Hendrix / Nov 15, 2023
This episode explores Broken Code: Inside Facebook and the Fight to Expose its Harmful Secrets, a new book by Wall Street Journal technology reporter Jeff Horwitz. His relentless coverage of Meta, including first reporting on the documents brought forward by whistleblower Frances Haugen in the fall of 2021, has been pivotal in shedding light on the complex interplay between social media platforms, society, and democracy. I spoke to him about his journey, new details revealed in the book, and the impact his reporting has had in driving platform accountability both in the United States and internationally.
A transcript of this discussion is below.
Jeff, this book is a journey through the last few years, even the early years of Facebook certainly, but it's also a bit of a story of your own journey. You start by talking about how in your reporting, particularly on politics, political accountability reporting, you were feeling that your work was pointless, that so much bad information was going viral, that the political situation in this country was degrading, that in many ways your reporting, you say, "felt like a weak attempt to ride bullshit's coattails." Start us off with how you came to cover Facebook and what was going through your mind when you first got that beat.
Yeah, so you're going to have to ask the guy who hired me, Brad Reagan, about how I came to cover Facebook, because I would not have hired me for this. I was a financially-minded investigative reporter who'd been at the Associated Press for four years. The 2016 election was a very big thing for us, and yeah, there's a lot of painstaking financial records work, things about Mar-a-Lago insurance claims, hush money payments, et cetera, et cetera, right? Paul Manafort's deals in Ukraine, all of that sort of stuff.
And it was hard to shake the feeling that it truly did not matter, that the stuff that I did that did get attention, and there was some of it that did, was just basically traveling in one partisan echo chamber and that everything else went, just died. This sort of seemed a little hopeless, and I think that there are a lot of explanations for what has gone wrong with the political news information ecosystem, and cable TV is a big part of it.
But it seemed pretty apparent that even the cable guys were taking their cues from what was going viral on Twitter and so forth. And I think I was despairing, not of impact, it's not like the world's supposed to bow down 'cause you've written a story, but of the idea even that information was being functionally transmitted. And candidly, I didn't understand how it all worked, and the center of power was obviously not in D.C. or New York, it was in San Francisco and the Peninsula.
So why Brad Reagan, why they hired someone who had no experience covering tech to do this job remains a mystery to me, but they did. I then spent my first year just trying to figure out how this stuff worked even remotely and, "Okay, what exactly did the Russians do and how did it matter? Did the Cambridge Analytica people actually accomplish anything or were they just snake oil salesmen?" So all that stuff, yeah, I had no idea about these answers.
But your reporting has had a huge impact. How many congressional hearings have now taken place as a result of the papers brought forward, first by The Wall Street Journal, from whistleblower Frances Haugen? This book, I suppose, marks a chapter in your life and a chapter in Facebook's existence as well.
A long chapter in my life. Look, obviously the book began with, "Okay, I have north of 20,000 pages of documents that are in unredacted form. I can contact the people who were behind the work." And I think something that Frances Haugen was very upfront about is that she took a lot of work from places around the company that she didn't have direct experience in. So, she was on the Civic team and had, I think, some expertise related to Civic work and to international manipulation attempts, but that's a reasonably narrow focus, and she grabbed documentation of issues that arose all over the place on that platform.
And so part of the question was, "All right, how did this work come to be and also how can I explore it in a way that actually gets into the story of how this bizarre unique body of knowledge came to exist?" Because I think as the company itself acknowledged, to some degree, one wouldn't create things like this if one thought they were ever going to see the light of day.
So this is a book that peers into this colossus, Facebook, but certainly peers mostly into the integrity work that's going on there, both the science of that, but also the policy piece of that, and the extent to which those things seem to always run up against management, always run up against the kind of corporate interest, the profit incentive that the company is following.
You write late in the book, "The story of Facebook's integrity work is, in many respects, the story of losses." I wanted to just start with that. What are some of the key losses in your mind? Where would you start?
Some of them began before the Integrity team was ever formed. As is laid out in the book, there was a big push post-2016, and rightly or wrongly, I think to some degree there probably should have been a realization earlier. But for whatever reason, the election of Donald Trump was a thing that, for people inside the company, wasn't so much like, "We should have tilted the levers, we should have adjusted the platform in some way to screw the guy over." It was that they just didn't even believe their platform was compatible with his victory, almost.
And so I think the idea that everything that Facebook does is always for the best, that "we are improving, clearly we are improving political discourse, we're improving at [inaudible 00:07:12] discourse." That was the first time that a lot of the employees, and executives to some degree, started wondering whether in fact they might have misstated that case. Go back to Mark Zuckerberg in the aughts saying that Facebook's spread in the Middle East was going to actually end all terrorism, because how could you hate people if you were connected to them? Well, that didn't work out very well. Right?
So there was, I think, this very pollyannaish sense of the product originally, and a sense that it was really overwhelmingly making people better and forging tighter social fabric and so forth. And I think that sense was in some ways a cover for, shall we say, a lack of curiosity into the actual mechanics of the product or what was functionally happening. Something that absolutely floored me, when I finally realized the date of it, was the introduction of the first functional classifiers for hate speech as guardrails. This is just an automated system that checks whether a growth-related Facebook experiment, one altering the platform in a way that is good for growth, spikes hate speech, right?
And the date of that was late 2019. We are at this point talking about a fully mature product, right? The Facebook of late 2019 and the Facebook of today are very similar platforms with very similar overall mechanics. They get tweaked all the time, but the feature set, the general feed, the approach to ranking, not much has changed. And so by the time that I think people even started thinking about what amplifying user-generated content meant and what the effect of optimizing for particular company-related metrics meant, it was almost too late to change. Nobody would've originally said, "Hey, should we introduce a feature that is going to cause misinformation to rise on a slow exponential curve as information travels further and further?" There might've been some hesitation about that. But you know what? That's called the reshare button.
So it's, I think, a really difficult thing, and candidly, I have a lot of sympathy for the idea that you wouldn't necessarily understand what the ramifications of the product were going to be, or what the ramifications of the constant tweaking were going to be. But there was a point, and I think this was in late 2019, when they finally developed these guardrails for growth things, that they realized that, actually, all the growth-related activity in the last six months had been actively undoing any of the integrity-related work they'd been doing in ranking, because it was just literally working at cross purposes, and whoops, they'd never known that before.
The company did hire a ton of data scientists, behavioral economists, machine learning experts, social scientists, computational social scientists, et cetera. In fact, I remember at one point somebody saying to me, "There must be more social science going on around questions like polarization, for instance, at Facebook than perhaps at any other institution anywhere in the world." Why is that disconnect so severe between what appears to be some real, well-intentioned science going on inside of the company and management's own understanding of exactly what it is that they're building?
A lot of it, I think, really does come down, particularly early on, to Chris Cox. Mark Zuckerberg has never been, I think, super wild about governance stuff. One of the criticisms of him is that he's trying to control speech online. No, Mark Zuckerberg desperately does not want to control speech online. In fact, he finds the entire topic to be boring, depressing, and it doesn't scale.
That said, post-2016, look, Mark Zuckerberg's 2017 commitment was to visit every state in the US. He was drinking milk straight from a cow in Wisconsin and stuff like that, while there were a lot of teams getting spun up by folks including Adam Mosseri and Chris Cox that were focused on misinformation, on trying to figure out what the actual ecosystem was.
And there was, I think I would call it, a sizable boomlet of hiring in the wake of 2016. They really threw gas on the employment fire for Integrity staff in response to the sort of one-two punch of Russian interference and Cambridge Analytica, and they started viewing this stuff as potentially existential, particularly when it began to tank the metric known as CAU, which is short for Cares About You.
And at that point, it was blank check time for a while on integrity work, which is that they recognized that they were taking a lot of incoming, and I think this sort of really... Even if some of the Cambridge Analytica and Russia interference stuff didn't really hold up all that well long-term, the sort of absolute drubbing the company got did drive the hiring.
So I think it's an interesting thing, 'cause to some degree, it was a little bit forced and panic-driven, but at the same time it was also, yeah, they ended up with a massive team of extremely intelligent social scientists, engineers and data scientists who were working with data that is unimaginable in any other social science context. And so it's not surprising that there would be things that were learned clearly and definitively in a way that just don't exist anywhere else.
You have a thesis about how Facebook works and you're on Facebook's core data science team, great, run a little experiment, pull out a 10th of a percent of all users in one country and enjoy running an immediate social science experiment on 50,000 people. This is like a truly amazing capacity they had. And so yeah, they learned quick.
And you of course have broken a lot of stories based on finding out what the results of that research are, even if the company didn't want to share it publicly. I do want to jump ahead a bit in the book, perhaps to chapter 13. This is when we're starting to get into the 2020 election and its aftermath.
Of course, there's great concern going into 2020 that there may be civil unrest, there may be violence, lots of folks are beginning to kind of scenario plan. What if Donald Trump carries on with these threats that he might not respect the outcome of the election, et cetera?
But I remember this moment around Labor Day and before the election, where Mark Zuckerberg I think told Mike Allen at Axios that he felt there was a real heightened risk of civil unrest in the period between election day and then after the results are called. This sort of expectation that, for a variety of different reasons, the result wouldn't be known.
And I remember thinking at that time, "We should be listening to this guy. He's got perhaps a better dashboard, a better understanding of the general discourse than anybody else on the planet." What do you think was going on in Facebook in those moments, preparing for the election? What's important for the listener to know about that social science apparatus, that integrity apparatus that had been built and invested in after 2016? Where had it arrived by the summer and fall of 2020?
This is the interesting thing: that was the late-in-the-game, "Oh crap, things might be more serious than we thought" stage, I would say. Because if you go back to Mark Zuckerberg circa October 2019, you will see Mark Zuckerberg basically comparing Facebook to the open internet and suggesting that there was no more moderation or regulation of it required than applied to websites generally, functionally, right?
I think the concern inside Facebook that things were going very south definitely long predated that Mark Zuckerberg appearance. In the fall of 2019, the Civic Integrity team produced a chart basically showing the company's preparedness for foreseeable problems in the election, and the thing was bad enough that they had to make up two colors of red. This was not a popular position.
And in fact, in the spring of 2020, the company was actually starting to realize how bad Facebook groups could be. And keep in mind, Stop the Steal and some of the things that came after would really reinforce that. But Mark Zuckerberg was actually on an anti-groups-moderation push. He was very concerned about over-enforcement, because it turns out that someone had taken down a group that Mark Zuckerberg was a member of. So basically, as the rest of the company is like, "Oh God, we've got a problem in groups," the CEO saw a different problem, which was over-enforcement.
Now, I think in the end, the data won out there, and by the summer and fall of 2020, there was this extremely hurried effort, supported by Guy Rosen, basically the head of Integrity at the time, and a lot of others, to try to clean this thing up. But it was a pretty late-in-the-game realization that they were going to need to batten down the hatches, if that makes sense. The Integrity work on Civic stuff had been stalled out for much of 2019, and I think everyone understood they were behind the eight ball.
You point out in the book that when it comes to groups, you say, quote, "A lone user could and did issue 400,000 invitations to QAnon groups over the course of six months." Just an extraordinary superpower to put in the hand of motivated extremists.
This is a sustained issue that has occurred within the company: Meta really hates the idea of overuse of their product being a problem. They have engineered the platform so that, overwhelmingly (and they've made a few little tweaks to this in the last couple of years, but only a little bit), a user who takes 100 actions over the course of an hour, who writes 100 comments, has 100x the influence of someone who writes one.
So it's like a voting machine, in which as soon as you're done voting, you can get back in line and vote again. Or really, there isn't a line, honestly. You can just keep on just pressing that lever like a slot machine addict in Vegas, right? That is all engagement and the company has always been deeply uncomfortable with attempting to limit the impact of hyperactive use.
And so things like how many group invitations can you send a day? What's the rate limit for that? Or how many friend requests can you send a day? They just never wanted to establish a limit, even when there were indications that things were going very south.
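The kind of rate limit being described here is, mechanically, a trivial thing to build. For illustration only (nothing in this conversation describes Meta's actual systems, and the function names and the cap of 50 below are invented), a per-user daily cap might look like this:

```python
from collections import defaultdict
from datetime import date

# Hypothetical illustration: a simple per-user daily cap on friend
# requests. The limit value and all names are invented for this sketch.
DAILY_FRIEND_REQUEST_LIMIT = 50

_sent_today = defaultdict(int)  # (user_id, day) -> requests sent

def try_send_friend_request(user_id: str) -> bool:
    """Allow the request only if the user is under today's cap."""
    key = (user_id, date.today())
    if _sent_today[key] >= DAILY_FRIEND_REQUEST_LIMIT:
        return False  # rate-limited; blocked attempts don't count
    _sent_today[key] += 1
    return True

# A hyperactive user is cut off once the cap is reached:
results = [try_send_friend_request("user_a") for _ in range(60)]
print(sum(results))  # 50 of 60 attempts allowed
```

The point of the sketch is the one Horwitz makes: declining to impose any such ceiling, even on accounts sending 100-plus requests a day, was a product choice rather than a technical limitation.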
One of the more tragicomic documents in the whole collection that Frances grabbed was one in which a data scientist from the Friending team, the team responsible for creating as many new friend connections as possible, informed his colleagues that half of all friending activity was coming from people who were sending more than 100 requests a day.
From my point of view, as soon as I thought that to myself, I thought, "Holy hell, no one has ever legitimately sent 100 friend requests in a day. That's not how friendships work. That's spam." And that was like, it turns out that's-
Yeah. And it turns out that's what they'd been optimizing for. Literally, they had been building their tools to encourage the people who were doing 50 invites a day to boost it higher. Because it turns out, it's far easier to turbocharge a set of motivated users to build the tools for those guys to go even more crazy, than it is to get other people to engage in what would be called legitimate use, maybe.
The funny thing is, after this guy recognized this thing, which is something, if I were running a platform, would probably make my blood run cold, everyone was like, "Well, are we sure it's a problem? Show us proof that it's actually a problem. Maybe people in the developing world just have different concepts of friendship." Which is very open-minded, but also a little half-assed.
Well, if perhaps there was an enormous focus on technically enabling groups to grow, there appears to have perhaps been less focus on applying technical fixes when bad things are going on. I'm struck by a bit of the social media report that ultimately emerged out of the January 6th committee. Not that it was published formally, but that was leaked slightly afterwards.
Well, in that report there was this little note that essentially, when it comes to groups, there was some kind of technical error taking place on Facebook sometime in the timeframe of 2020. Ultimately fixed, the report says, in October of 2020. But it says that basically groups were not receiving strikes for violence and incitement for months.
When finally that was fixed, it resulted in an immediate strike being doled out to 10,000 groups, and at least 500 were almost immediately removed. Meaning that there were thousands and thousands of groups where people were literally breaking Facebook's own policies, and because of some technical error, Facebook wasn't doing anything about it. So, I'm struck by that, just irony. Everything's been done to enable the function of these groups, but any of the brakes, for whatever reason, seemed to have been coming off.
This is just so very common. The, "Oh, whoops," the technical flub thing, it is a recurring theme of the book, right? 'Cause here's the thing, these guys are very smart. I think they are good engineers, I don't know that they are... Look, it's not a particularly big surprise that at a company that never really gave up the motto, "Move fast and break things," that things frequently broke, right?
Another thing, and you could have mentioned this just as well, was the company let through 8 billion bad viewport views, aka, impressions, between the few days after the election and January 20th, 2021, because they just hadn't noticed that a system for filtering highly viral, highly popular, but violating posts had just broken. Whoops.
Not all of those 8 billion, in fact, only probably a minority of them were actually US politics-based. This was a worldwide failure, but things like this just happened all the time, and I think the engineering staff frequently was pretty frustrated with it, in the sense that there were many notes written by people talking about how the way you got ahead at Facebook was by being an arsonist fireman, aka, that you build something that catches on fire immediately and then you get credit for putting it out.
And so this is just, I think, a cultural thing. And in some ways, it's almost like the downside of what made Facebook so successful, which is that I think Mark Zuckerberg very correctly realized that social media was a completely new world and that there was a land grab to be had. And I think an entity that really thought carefully about introducing new features, rather than just looking at a chart and saying, "Number one up, launch it," that entity would not have won the dominance in the industry that Facebook did. And so in some respects, it's not surprising that they had a hard time giving that up. It worked so well.
There are a lot of details in this book that I find myself thinking, "Wait, did we already know that from Jeff's reporting, or is this brand new?" One of them was the singular role that Mark Zuckerberg played in ordering election delegitimization monitoring to immediately stop after election day in 2020. That was a decision not taken by some committee or some group in management, it was taken by Mark himself.
That one is new. There's definitely some reprise of previous stuff, but that was not, to my knowledge, anything that's been written before. Yeah. No, this is, I think, a really interesting thing, and I touched on that Georgetown speech: it's almost as if Facebook's CEO kind of struggles sometimes with understanding network dynamics and understanding that you actually have to deal with communities online. You can't just build a tool and somehow it's going to be a neutral weighting system. You have to look at how people are trying to exploit your content ranking and recommendation systems. You have to look at how they behave, right? Whether certain groups, such as the anti-vax movement, are, shall we say, engaging in widespread manipulation of engagement to promote their particular point of view.
And it just didn't go over well. So in this instance, basically as soon as the election was called, Mark said, "Not only can you not shut down election delegitimization groups, I don't even want you studying it." This was conveyed to leadership and to the Civic Integrity team. And in fact, the company went deliberately blind on this. I think this is something that Samidh Chakrabarti, the head of Civic, wrote in a memo to HR: he had been told by Guy Rosen that paying attention to what election delegitimization was happening on the platform would just put pressure on Facebook to do something about it, and that Facebook did not want that pressure.
I want to talk a little bit about one of the measures that Facebook does use when it tries to vaunt the success of its moderation efforts, which is around the prevalence of certain material that it seeks to remove. Hate speech is one, of course, other things, child sexual abuse material, other types of concerning material. The company comes back again and again to the idea that it's been able to reduce the prevalence of some of this material down to what it believes is a reasonable number.
The book is bookended, I should say, by mention of a particular individual, Arturo Bejar, who just happened to appear, I think, for the first time on your pages as yet another whistleblower coming forward, and just also testified in front of the Senate earlier this week, during the week I'm talking to you. Arturo really brings into question the way that the company thinks about prevalence, which I think is very important in many ways. It seems like something that more folks should be paying attention to as a way of helping them to understand how to evaluate claims from the senior executives at Facebook.
Yeah, absolutely. So first, just to address something, the timing of Arturo Bejar going public in The Wall Street Journal and then testifying in front of Congress, I want to promise you that is not based on the book getting released in short order, because I hate it when people do that. And I can assure you, The Wall Street Journal is just very slow. We take our time.
I certainly was making no suggestion that was the case.
Good. But that said, I think, look, prevalence makes a certain amount of sense, right? It is just, "Okay, what can we measure? Of all the impressions served on Facebook or Instagram, how many of them were bad, divided by how many overall impressions there were?" It's a logical metric to use for some things, but it fails in a whole bunch of different ways.
First, it assumes that you are perfectly able to define what a problem is in an automated system, which is damn near impossible, right? If you have a rule against calling people a particular word, then you can just call them that word plus a star in the middle, and congratulations, it's no longer hate speech, right? So there's first the question of what you measure, and then the question of which communities it's traveling in. Misinformation can be a reasonably low-prevalence thing, but if all of it is going to a small subset of the population that is absolutely convinced that Tony Fauci is trying to murder them and their children, I think you might still have a problem.
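The arithmetic behind that concentration point is simple to see. A toy calculation (all numbers below are invented for illustration, not Meta's figures): a platform-wide prevalence that rounds to a fraction of a percent can coexist with heavy exposure inside one community.

```python
# Invented toy numbers: 1 billion impressions platform-wide,
# of which 2 million are misinformation.
total_impressions = 1_000_000_000
bad_impressions = 2_000_000

overall_prevalence = bad_impressions / total_impressions
print(f"platform-wide prevalence: {overall_prevalence:.2%}")  # 0.20%

# But suppose most of those bad impressions land in one small
# community of heavy consumers (again, hypothetical numbers):
community_impressions = 10_000_000
bad_in_community = 1_900_000
community_prevalence = bad_in_community / community_impressions
print(f"prevalence inside the community: {community_prevalence:.2%}")  # 19.00%
```

An aggregate figure of 0.2% looks negligible; the same data, sliced by audience, shows nearly one in five impressions in the affected community being misinformation, which is Horwitz's point about what the headline metric hides.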
Now, another thing about the prevalence issue that sort of arose is that, in the end, people like Arturo were very good at demonstrating, I think, that prevalence was not catching the user experience, that prevalence had nothing to do with whether or not people were sending harassing DMs. I think one of the more poignant ones is that when he came back to the company as a consultant, he ran this survey called BEEF, Bad Emotional Experience Feedback, that asked just simple questions like, "In the last week, have you been subject to unwanted sexual advances?" And the answer for that, for the overall user base, was uncomfortably high, and for teenagers it was one in eight in the last week. Teenagers under the age of 16, I should say.
So here you have Meta putting out statistics saying that sexual exploitation on its platform is so rare that it cannot even be measured. And you have one in eight teenagers saying, "Oh yeah, no, definitely someone propositioned me/was hitting on me in a creepy fashion in the last seven days." And I think Bejar's position was like, "Guys, you got to wake up. You have built a system that is telling you the numbers you want to see, rather than anything approaching the reality of what users experience."
I want to talk about one of the individuals that you do discuss in the book who's constantly a mouthpiece for these types of figures, who's often the person tasked with explaining research about Facebook to the world or explaining how these figures make sense, which is Nick Clegg, who joined the company perhaps in the midst of the scrutiny around Cambridge Analytica and Russian disinformation and the Trump years back in 2018.
In the book, you talk a little bit about the disconnect sometimes between the research that's happening in the company and the company's researcher's understanding of certain phenomena with regard to Facebook and Instagram and their effects on people in the world, and what ends up coming out of Nick Clegg's mouth when he speaks at a kind of higher level often to elite audiences or the media. How does that disconnect happen in your view?
Look, Nick Clegg is, like a lot of us, I think, a generalist. And I think he's a good guy and an effective spokesman, so this is not a criticism of Nick directly. But yeah, it seems like the company really sort of liked to confuse itself with the open internet, which is, "Well, of course there's always bad stuff out in the world, and particularly on the internet. How could there not be on Facebook?" Absolutely true.
The company absolutely hated getting into any conversation that involved what control it had over the platform, and in particular how choices that it made might affect the frequency with which bad things occurred on the platform, like, "There's bad things offline, and of course that's mirrored online." Totally true, right?
And I think for a long time, the company got away with that line of thinking, which is that, "At worst, all we're doing is promoting the stuff that you want." Right? So, Clegg put out a note at one point titled It Takes Two to Tango, basically suggesting people are responsible for their own feeds.
But at the same time, there are a number of ways of figuring out whose interest Facebook was serving. And I think one of the interesting things is that internal research into what users wanted routinely found that users did not want Facebook to make a lot of the choices it was making. They wanted them to do more in terms of only serving up information from reliable sources. They wanted the company to take a more active hand in enforcement. There were a lot of things that people were asking for the company to do.
Obviously, the people who were in an ecosystem that effectively abused Facebook did not like these approaches and screamed bloody murder whenever anyone tried to do it. But the company just tried to basically absolve itself of responsibility for the design choices and curatorial choices that it was making, under the premise that the platform was, as a piece of technology, inherently neutral.
One of the things that you focus on in the book is how some people who came out of this period, this intense focus on integrity issues, all the science, et cetera, they've now, in many cases, left the company. They're out in the world making various types of proposals about how to do social media better and differently. You mentioned, for instance, the Integrity Institute. You mentioned that apparently the launch of the Integrity Institute was not well received within Facebook.
Is there a kind of silver lining in the fact that even if just for a brief moment this kind of focus existed inside the company, that it spun out this sort of set of people who are thinking about these issues in a more dynamic way, and perhaps maybe down the line that may lead to something better?
Oh, no, this is, I think if anything, if I had to think about the most positive thing that came out of all the Facebook files stuff, it's that some of these people started speaking up. Back in 2017, if a regulator in Europe or anywhere wanted someone to explain the dynamics of re-shared content, it would've been complete crickets. Nowadays, I can rattle off a dozen names of people who I would be confident could walk through that technically with extreme precision and clarity.
And I think, yeah, so Facebook hired a ton of extremely intelligent, talented people. It trained them on data unlike any the world has ever seen before. It then thoroughly pissed them off and demoralized them in many cases, and then it laid them off under the more recent Year of Efficiency.
So, there's been this body of technocrats, people who actually understand the mechanics of Facebook and Instagram. And I think particularly for that company, it's especially interesting because Facebook has basically grabbed every design feature that it possibly can from every other platform. It has all of the problems in a way that are, I think, very educational if somewhat unfortunate. People sort of really get the dynamics. They get how various product features interact, and they started speaking up.
And I think, look, if we're talking about things that the Facebook files fundamentally changed, I think the only thing that happened is that they called off, I'm sorry, they paused Instagram Kids on a long-term basis, right? Other than that, all the promised regulation, at least in the US, didn't come to be, at least it hasn't yet.
And in Europe, there's the DSA, which I've been told has had some influence. But if I had to name the one thing that sort of feels optimistic about all of this, it's that there are at least people who are functional guides to this topic who are out and about now, getting grant funding, setting up organizations, and that's neat to see.
Jeff, my last question to you is, do you reckon you'll ever get to talk to Mark Zuckerberg again?
I would say I have a very cordial and professional relationship with comms. I don't know, professional might be an overstatement 'cause it involves me, but we have a good relationship. And I have noted that Mark Zuckerberg interviews have not gone to The Wall Street Journal in recent years. I don't know that's going to change, but that could change any day. And if Mark is listening and wants to pick up the phone, I'm here.
One of the conclusions of this book is that the company ultimately didn't... It learned what it was doing, it saw the downsides, it quantified them in very rigorous fashion, and then chose not to change a lot of things. And I can understand why that would be an unpopular point of view if this thing were your baby.
Well, I'm not certain if you've yet been able to slake your appetite for accountability, that perhaps you came into the job with, but certainly you have produced an enormous amount of scrutiny on one of the biggest companies in the world and some of the most important questions about the relationship between tech and democracy. So, I thank you for that work, and I thank you for this book, which is called Broken Code: Inside Facebook and the Fight to Expose Its Harmful Secrets by Jeff Horwitz, out from Doubleday. Jeff, thank you.
Thank you. And I would put it this way, I'm even more surprised than I think anyone else is that I'm not sick of this company yet. It's bizarre. I definitely assumed when I was done writing this book, I was like, "That will be the end of that." And weirdly, no. So, thanks so much for having me.