Shedding Light on Google's Dark Side
Justin Hendrix / Jan 8, 2023
Imagine a company that hides who it works with and where billions of dollars flow around the world. That earns its profits financing a global network containing piracy, porn, fraud and disinformation, even doing business with figures sanctioned by the U.S. Treasury, including Russian companies that may access and store data about people browsing websites and apps in Ukraine, potentially opening a mechanism for Russian intelligence to target individuals there. A company that tells the public that it doesn’t make money from guns that nevertheless does business with the maker of the AR-15, the weapon used in so many horrific mass killings, including the recent massacre of teachers and students in Uvalde, Texas.
Is this some organized crime syndicate or shady offshore shell company? No, it’s Google, one of the biggest and most prominent technology companies on the planet.
This episode features a conversation with Craig Silverman, a journalist who has spent years uncovering fraud in the opaque world of digital advertising and media manipulation. With his colleagues at ProPublica, in a recent series of articles Silverman employed a unique investigative approach to uncover just exactly how Google operates in a shadowy realm of deceit and disinformation. Headlines in the series include:
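"Google Says It Bans Gun Ads. It Actually Makes Money From Them." by Craig Silverman and Ruth Talbot, June 14, 2022
"Google Allowed a Sanctioned Russian Ad Company to Harvest User Data for Months" by Craig Silverman, July 1, 2022
"How Google's Ad Business Funds Disinformation Around the World" by Craig Silverman, Ruth Talbot, Jeff Kao and Anna Klühspies, Oct. 29, 2022
"Porn, Piracy, Fraud: What Lurks Inside Google's Black Box Ad Empire" by Craig Silverman and Ruth Talbot, Dec. 21, 2022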
What follows is a lightly edited transcript of the discussion.
We're going to spend a little time talking about your reporting on Google over 2022, four particular stories that we'll get into. But I wanted to just, for any of my listeners who aren't necessarily familiar with your career, your background, I think of you as sort of one of the OGs of digital investigations, and sort of really digging into how these big platform companies work, how ad tech works, how the technology works, but also the economics of the ad tech ecosystem. And I also think of you as someone who's been very kind to share your knowledge and your approach with a variety of my students in the past, as well as the broader journalism community. But in your own words, how would you describe your beat?
For me, the beat that has kind of emerged, and it's been now almost two decades, which is the crazy thing for me to think about... A friend, Joan Donovan at Harvard, calls me Old Man Disinfo, which is not the nicest nickname I've ever had, but I guess it's kind in the sense that I've been at this for a little while. I think of what I do as really investigating the digital environment, but specifically around media manipulation. So the information and ways that we interact and communicate in this new environment dominated by big social platforms, run by big tech companies, mobile phones, all of the new ways that we're communicating in a much more democratized and open media environment, which has created so many opportunities for our voices, but also creates many more opportunities for manipulation.
And so I'm constantly fascinated by the devious and clever ways that people are manipulating this environment, whether it's on technical levels, or just for spreading false and misleading content, which again, is something I've been looking at and investigating for quite some time. I'm kind of an old school media blogger, and that's something that I've carried through now for almost two decades, as our media environment has changed dramatically during that time.
And I think it's fair to say that you are one of the best at using digital investigative techniques, and sort of digging slightly under the hood of how some of these techno systems work, or technical systems work, in order to get at details that other journalists might not.
That's a nice compliment, and what I would say is I'm pretty obsessed with that. I am constantly obsessed by finding new ways to just pop under the hood and see what's going on, finding new ways to make sense of this massive amount of data as people communicate back and forth, and the minutiae of that, the interactions, the likes, the shares, I have really been fixated on ways of trying to get my arms around the scale of this activity that we're dealing with, because to me, that is one of the defining challenges and elements of this information environment is just scale. Billions of people are on platforms, the amount of content and interactions happening at any given moment is so big that these companies themselves don't actually, you'll find, have a real handle on what's going on.
And that was always an interesting experience for me, going back close to a decade ago, doing this kind of reporting and showing, for example, here's the kind of stuff really going viral on Facebook that's crazy and false or what have you, and I would realize later that this was kind of news to Facebook as a whole. They didn't have the tools and the desire to necessarily be tracking things to that extent. And so I just think there's so much to be found, and we shouldn't assume that these companies have a really good handle on what's going on because the scale is unknowable. Even for them building and operating these systems, they try to build their dashboards, they try to build these internal controls, but the systems are just so big that massive things can hide in a small corner.
That's a good introduction to these four pieces that you had in ProPublica last year, particularly focused on Google in the second half of last year. There are four headlines we'll talk about a little bit. The first, June 14th last year, with Ruth Talbot: "Google Says It Bans Gun Ads. It Actually Makes Money From Them." On July 1st, "Google Allowed a Sanctioned Russian Ad Company to Harvest User Data for Months." Again with Ruth Talbot, but also Jeff Kao and Anna Klühspies, "How Google's Ad Business Funds Disinformation Around the World," October 29th. And then most recently, just before the holidays, again with Ruth Talbot, "Porn, Piracy, Fraud: What Lurks Inside Google's Black Box Ad Empire." You are a very popular person right now, I'm sure, at Google.
We've definitely had some conversations with them. I will give them credit in that it's always a professional interaction, and as much as we don't get all of the information and responses from them that we want, they did engage for the most part on these pieces. But yes, it was sort of a year of Google's business, for me, and in particular for Ruth Talbot, who is a member of our news apps team, and Jeff Kao, another data journalist, and Anna Klühspies was a fellow at ProPublica for a few months. She is a journalist based in Germany.
And the backstory on the stories was that starting about last spring, and even before that, Ruth and I had been talking about digital ads. And for me, my obsession with digital ads goes back to at least about the end of 2016, when I just woke up and realized not just that, well, Google and Facebook make their money primarily from ads, but realizing the scale of the manipulation and fraud taking place in digital ads. At the end of 2016, there was a New York Times story about a digital fraud scheme that had stolen lots of money by pretending to be outlets, news outlets, and getting ads, when there was no real outlet, and just started stealing the money. And I thought, "Wow, that's incredible." I had been writing about false information and fake news, and it hadn't occurred to me that you could have entire fake websites just to earn money. And so I started doing stories about digital ad fraud, and it's been a good five, or close to six years of that. And Ruth and I talked about really wanting to understand these very big complicated ad systems more, and find stories.
And when you're talking about digital ads, you have to talk about Google. It is the biggest digital ad business in the world, but, and this is where it's a reporting challenge, it also operates kind of the guts of the system, the stack of ad tech, where if you want to be involved in programmatic advertising, the automated buying and selling of ads, you basically, in some way, are probably going to be dealing with Google. Whether you are an ad buyer, a brand, whether you are an ad seller, like a publisher, you are probably using some of Google's tools, and you are probably using some of Google's tools to meet in the marketplace to buy and sell those ads. And so we decided to really try to focus on Google's ad business, and to better understand it and more about the kind of fraud and stuff that is lurking in it that I think a lot of people avoid digging into because it's super complicated and difficult to parse.
Let's talk a little bit about this idea of publisher confidentiality, which seems to be very important to Google's model.
So Google runs really the world's largest ad network. So if you have a website, and you want to get ads on that website, you can apply to be part of Google's ad network, and they'll review your site. If it meets all of its requirements, you should be able to just add a little bit of code to your site, and start getting ads, and start getting payments from Google for that. And so Google has publicly said that there's roughly more than 2 million different websites, and potentially apps, also, that are part of this network. It's a massive network, it's the biggest one in the world, for third party sites and apps.
In addition to Google obviously placing ads on its search and all those other things. And the thing that's really unique about it, aside from its size, and aside from the fact that it brought in over $30 billion in revenue in 2021, so it's really big money-wise, and pays out better than pretty much any other ad network, and is available in more countries than any other ad network, the fact is that Google... If you wanted to know, "Well, what are the websites and apps that Google actually works with? What are the ones where if you were to buy ads through Google on this network, where might your stuff show up?" Google won't tell you. There is no list.
And that may not seem surprising, except for the fact that Google actually worked as part of an industry coalition for several years, to come up with an industry standard of transparency to actually enable ad networks like its own to release all of the publishers, the sellers, the ad sellers they work with. And so this standard came out a few years ago, and Google's competitors basically released their lists and maintained these lists, and Google was sort of like, "You know what? No." So you can't actually know all the websites and apps at any given time where your ads might appear, and who Google is working with and paying money to. And this is unique in the industry. There is no other ad network anywhere approaching a similar size that allows its publisher partners to be confidential as Google lists it.
And so one of the things that we set out as our goal for last year was, can we de-anonymize Google's secret publisher partner list? I mean we know that ProPublica is part of it, we know the New York Times is part of it, because there is a percentage, roughly 20-some percent, that actually do list themselves publicly, and they're publicly acknowledged. But then there's millions more sites and apps that you don't actually... The only way you can know that they're in there is if you buy ads on Google and suddenly they show up in your report of where your ads appeared. And that can be a problem, because your ads can show up in some pretty awful places. And so we decided to see if we could actually de-anonymize this list, and make it more transparent and understand what's in there, and why, perhaps, Google might not be so eager to disclose that.
You write that, "ProPublica spent months trying to crack open Google's black box ad business. We wrote thousands of lines of code to scan more than 7 million website domains looking for Google ad activity, sourced and analyzed data on millions more domains from more than half a dozen data partners, and spoke to some of the most knowledgeable experts about Google's display ad business." And you were able to match 70% of the accounts in Google's ad seller list. Apparently the largest data set that's ever been produced by an external party, perhaps more insight than anyone's ever had on Google's ad business.
Yeah, and yet, we didn't get to a hundred, did we? And looking back, in the spring, we were foolishly thinking, "We'll de-anonymize it. We'll do a large scale analysis of the hundred percent, and we'll really get more insight." And it became clear to us as we were a few months into it, this is really hard. And it's hard for a few reasons. One, is just, again, the scale of it. Two, the fact that Google conceals it, and so you're sort of pawing around in the dark. And three, also because it changes on a really regular basis. Google does actually release these unique IDs for each publisher account, but it won't tell you what apps and websites those accounts are linked to. But we can see, for example, that on a weekly basis, Google might remove 5,000 of these unique account IDs, and then add another 3,000 more in.
So they're basically saying, "Well, we've created or activated 5,000 new publisher partners, we've removed 3,000, we're not going to tell you what sites and apps they might own, but these are people we're working with." And they're just anonymous strings of letters and numbers. And so we could see that Google was apparently changing the makeup of its publisher partner structure, but we wouldn't know what sites and apps were part of that. And so in the end, 70% is the highest that anyone has ever gotten on that. But everyone that we went to, to sort of see if they would share data and talk about it, really underscored how difficult it was. And this is something that we encountered even on what would seem like an even more simple task: we built a tool where we could scan a webpage and determine with a high degree of certainty whether Google Ads were active on that page, which sounds like a really simple thing, but again, it's deceptively complicated because there are any number of network requests and other things that could result in a Google Ad showing up on a page.
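The weekly churn described here can be pictured as a set difference between two snapshots of the published account IDs. What follows is a minimal sketch in Python, assuming each snapshot has already been reduced to a plain list of opaque ID strings. The function name `diff_snapshots` and the sample IDs are hypothetical, invented for illustration; this is not ProPublica's code.

```python
def diff_snapshots(previous_ids, current_ids):
    """Return (added, removed) account IDs between two snapshots."""
    previous, current = set(previous_ids), set(current_ids)
    added = sorted(current - previous)    # IDs that appear only in the newer snapshot
    removed = sorted(previous - current)  # IDs that appear only in the older snapshot
    return added, removed


# Hypothetical sample data: two weekly snapshots of anonymous account IDs.
last_week = ["pub-0001", "pub-0002", "pub-0003"]
this_week = ["pub-0002", "pub-0003", "pub-0004", "pub-0005"]

added, removed = diff_snapshots(last_week, this_week)
print(f"added: {added}")
print(f"removed: {removed}")
```

A diff like this shows only that accounts came and went. It says nothing about which sites or apps sit behind each ID, which is the gap the reporting set out to close.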
And so that process also took longer, but we did get to the point where we were able to scan millions of domains and be able to determine whether there was an active Google monetizing relationship going on at the moment we scanned it. And then that helped us fill in the blanks, because then we could also match those ad requests to the unique ID. And that's how we worked on it. And we came across really surprising, strange things. For example, a network of dozens and potentially hundreds of manga piracy sites, so like Japanese comic piracy sites, that do an astonishing amount of traffic, and that Google in some cases is directly placing ads on them in clear violation of its own policies against helping monetize copyright infringing material. And so I had no idea manga piracy was such a massive problem, and yet, when we started to dig into Google's network, we see that Google is one of the key monetization partners for these sites.
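The matching step described above, tying ad activity seen on a page back to a unique account ID, can be sketched in a much-simplified form. The Python below assumes that publisher IDs show up in page markup in the `ca-pub-<16 digits>` form used by AdSense embed code, and that the published seller list uses the same ID without the `ca-` prefix. Those formats, the function names, the `.test` domains and the sample IDs are assumptions for illustration and do not come from the interview. The tool described in the conversation also had to account for network requests, which a static scan of the HTML like this one would miss.

```python
import re

# Assumption: IDs are embedded in markup as "ca-pub-" followed by 16 digits.
PUB_ID_PATTERN = re.compile(r"ca-(pub-\d{16})")


def find_publisher_ids(html):
    """Return the set of 'pub-...' IDs referenced in a page's HTML."""
    return set(PUB_ID_PATTERN.findall(html))


def match_domains_to_sellers(pages, seller_ids):
    """Map each known seller ID to the domains where it was observed."""
    matches = {}
    for domain, html in pages.items():
        for pub_id in find_publisher_ids(html):
            if pub_id in seller_ids:
                matches.setdefault(pub_id, set()).add(domain)
    return matches


# Hypothetical scanned pages (domain -> HTML) and a hypothetical seller ID list.
pages = {
    "example-comics.test": '<script async src="https://pagead2.googlesyndication.com/'
                           'pagead/js/adsbygoogle.js?client=ca-pub-1111111111111111"></script>',
    "example-news.test": '<ins class="adsbygoogle" data-ad-client="ca-pub-2222222222222222"></ins>',
    "example-blog.test": "<p>No ads here.</p>",
}
seller_ids = {"pub-1111111111111111", "pub-2222222222222222", "pub-3333333333333333"}

matches = match_domains_to_sellers(pages, seller_ids)
match_rate = len(matches) / len(seller_ids)
print(f"matched {len(matches)} of {len(seller_ids)} seller IDs ({match_rate:.0%})")
```

The match rate printed at the end is the same kind of figure as the 70% mentioned earlier: the share of published account IDs that could be tied to at least one observed site.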
It doesn't just stop with manga, though. You find other evidence of impacts of this model. Porn, fraud, you even kind of get onto sanctioned websites, or websites that potentially are in countries like Russia where Google is meant not to be doing business in some respects.
Yeah, there were two stories that ended up touching on sanctions. And so the first one I'll talk about is, we did this large scale analysis working with fact checkers in countries on different continents around the world, because we wanted to see how common Google ads were on material that was clearly fact-checked and marked false, and that also was highly likely to violate Google's rules against health disinformation, against climate disinformation, against content that undermines democracy and electoral process.
And so we worked with partners in Bosnia, Turkey, a few countries in southern Africa, countries in South America, and we got data sets from fact checkers, and then were able to use our tool to scan these articles and these websites to see if Google was monetizing them, and nobody had ever done that type of analysis before. And what we found, for example, when we worked with a group of fact checkers in a few countries in the Balkans, was that there was a site, a Bosnian site, that was connected to the family of a kind of genocide denying separatist leader in Bosnia. And this was a website that had specifically been sanctioned by the US Treasury. And this figure, he was sanctioned by US Treasury, he was also sanctioned by the UK government, because he's basically seen as trying to break up the Dayton Peace Accords that had ended the war in Bosnia.
And what we found was Google was placing ads from major brands, Guess, and other luxury brands and recognized brands, on this website affiliated with him and his family. And so Google removed those ads as soon as we alerted them to it, but it shows you the kinds of things that can slip through the cracks. And not only was that a sanction concern, we also did a story working with a researcher who runs a company called Adalytics, who does great work in analyzing digital ads. And we showed that Google was continuing to send data related to ad bids, so the buying and selling of ads where somebody will say, "Hey, I've got ads available on my website. Here's the type of user. Do you want to show them an ad?"
That's called bidstream data, and Google continued to send huge amounts of bidstream data to this sanctioned Russian company that's owned by a major sanctioned Russian bank. And potentially among that data would've been information about people in Ukraine. And it could have opened up, if for example, Russian intelligence services had gone after that data, we have no idea whether they did or not, but it created a risk that it could have helped fill in some of the picture for Russian intelligence and military services. And so Google seems to have really failed very poorly around enforcing particularly Russian sanctions, since the invasion of Ukraine almost a year ago. And this has been documented not just by us, but by other people as well. And Google's response to that is to basically say, "We do our best to comply with sanctions." They don't get into specifics, and that's one of the areas where they will not really engage with you much.
And I have seen, of course, in your article here, you really traveled the world. You talk about the impact of this model and this phenomenon in Brazil, of course in Eastern Europe and in the Balkans, really just across the planet. Are you able to make a judgment about how many millions or billions of dollars we're talking about here that are flowing to this sort of black box network? The company tells you in response in the most recent article, that upwards of 70% of the business is essentially going to publishers that do disclose, right?
Yes. That's one of the responses Google had, is that the vast majority of the money flowing through its ad network is going to publishers that are not confidential, that you can look up and see who they are, but they won't actually give specifics on that. So it is a case where you have to take Google's word for it. Similarly, with the manga piracy sites, Google's spokesman didn't say this, but a Google engineer on Twitter engaging with people said, "The traffic related to these confidential sites, whether it's manga piracy, whatever, is very small in terms of the actual ads being placed." And so that's what Google's line on it is. On the one hand, it's saying, "There is no connection between confidentiality and bad actors." It's also saying, "But listen, just so you know, this confidential stuff, it's a tiny amount." And so they're sort of saying, "Well number one, you have nothing to worry about. But number two, just so you know, you have extra nothing to worry about. But we still have all these confidential sites even though there's nothing wrong with it."
And they didn't really give an answer as to why, to this day, they couldn't have made more progress than when they were last called out on this two years ago. They say there's potential privacy concerns and all that, but literally no other ad network has cited that. And so I think this is a familiar piece of territory for you, and I'm sure for some of your listeners, the tech companies, they have the data, it's there, they could pull it, they're not sharing it. And of all of the areas of lack of transparency, there are many, but where we have made some of the least amount of progress, I would argue, is in programmatic advertising. You have Facebook for example, has started to make political ads archived and available. Google has started to make some ads searchable, but in terms of actually getting data about where money is flowing, which is the core of your question, who is getting money? Who is part of these networks, how much are they being paid? There is no transparency. Almost no transparency around that.
And so we are not able to estimate how much the confidential sites get, because we have to rely on only Google's general numbers. And when it came to our investigation into Google funding some of the worst sources of disinformation in these different countries around the world, we can't estimate how much money these sites make, even though some of them told us Google is one of their main sources of revenue, because digital ads are so complicated. An ad placed today on an article shown to me, you could load that same article in your web browser, and you might get a differently priced ad. And so it is dynamic, it is based on realtime auctions, this data is flowing constantly around the world through many different systems. Not only does Google run its own ad exchanges connecting buyers and sellers, but other people's ad exchanges also run through Google. And that's how all these gun ads flow through Google's systems, even though it says it doesn't take gun ads.
And so at the end of the day, we talk about it being a black box ad empire because there is just so much that we are unable to see, and that anyone working outside Google is unable to see. And it's astounding to me that brands are spending billions and billions and billions of dollars on systems where there is a really alarming degree of fraud, where they don't have any assurance really if they're relying on Google to place their ads, which a lot of them do, on where their ads might end up and how bad those sites and apps might be, and where at least 15% of the money is untraceable. In an independent industry study that was done a few years ago, they literally could not trace 15% of where the money went in ad buys.
So I think of all the systems that we are interacting with as consumers around the world, the digital programmatic ad systems are the biggest of all the black boxes, the craziest because of all the money that flows in and nobody knows where it goes, and has the highest potential risk for funding organized crime and lots of terrible things, but we can't actually put our finger on it and prove it, because it is so opaque and difficult to track.
I just want to pause on the story about guns for just a moment. You point out that Google has, for most of its existence, claimed that it doesn't accept gun ads, but your analysis, again, you found 15 of the largest firearm sellers in the United States. This is everybody from Daniel Defense, the company that makes the AR-15, on through to many others, are essentially running Google ads. And that the sites that are accepting those ads, of course, are making money from that as well. Just kind of extraordinary. I suppose I should ask you, because you are... What was Joan's term? Was it Grandpa of Disinfo?
Grandpa Disinfo. I don't know why I'm helping popularize that. This is the Streisand effect.
Okay, maybe we'll avoid trying to popularize it, but because you do have a years-long purview on this, you mentioned, of course, concerns raised, letters written by Senator Mark Warner, for instance. Where is Congress on this? Where is any kind of legislative recourse to these problems?
Well, this is one of the things that is different now. If you had asked me this a couple years ago, I would say they're nowhere. Aside from Warner and a few others occasionally sending a letter to the FTC or calling on Google or whatever, there was nothing. But there's been... As you well know, technology regulation, big tech regulation, is one of the few potentially bipartisan areas right now in Congress. And there are touchpoints of bipartisanship, but the two parties have very different goals and perceptions of why big tech needs to be regulated, except they agree that it's too powerful, but for different reasons and different outcomes. And so Senator Mike Lee has proposed legislation around digital advertising, Senator Klobuchar, so we've got a Republican and a Democrat, both of them have shown interest in regulating digital advertising. Mark Warner has had a long-standing interest and Schumer actually had an interest years ago, in concerns around digital advertising fraud.
And so Congress, it has been trying to come up with legislative approaches to digital advertising. And one of the guiding principles that's out there in some of these legislative proposals is the idea that you should regulate programmatic advertising the way you do financial markets, which I think is a really interesting idea, because you do have this scenario of exchanges of buyers and sellers coming together, and it's supposed to create the fairest possible model. But what happens is that in programmatic digital advertising, Google has... If you pick every piece of the stack of the tools and platforms people need for buying ads and selling ads, and if you look at the exchange element of where they meet to make that purchase or sale, Google is basically the dominant player in all of them, to different degrees. But taken as a whole, there is a concern about Google's monopoly in digital advertising that is animating Congress as well.
And so one of the things that they want to do is mandate elements of transparency and control, so that Google is not able to potentially pollute or otherwise make these auctions less than fair. And there have been independent cases alleging Google is doing that, but as of right now, Congress has a desire, but really at this point, on the day we're speaking, it's a gong show of who's going to be Speaker of the House. The Senate is clear, but are we going to see any legislation happen? That's really not my area of expertise, but I think there's a desire, but it's hard to say whether this is the type of big tech stuff that goes to the top of the agenda.
I think it's also just really hard at this point to discern how many events that we're seeing in the world are driven by the dynamics that you're reporting on. To some extent, I've just finished reading the January 6th committee's report and talking with some of the people that did the social media aspect of that investigation, and one of the things that's very clear is that there are a lot of people making money on the big lie, that the election was stolen from Donald Trump. And, of course, all of that is powered by and monetized by this ad tech ecosystem that allows individuals that are able to draw attention to their claims and their websites, to essentially earn money from it. And a lot of that's running straight through Google. So that's just one example that occurs to me. You've obviously reported on and offered so many others. But it's beginning to be difficult to discern what in the real world is disinformation and what's just grift, and how much of those things are swirling around in a pot together.
Yeah. And you often will see people realizing, "Well, I genuinely believe the election was stolen and now I'm going to build an infrastructure around it, and monetization and funding is going to be a key part of that." And so whether they're coming at it genuinely and trying to figure out how they create a sustainable operation or a business around their views, or they're looking at it and saying, "Man, there are some suckers to be had here. How do I target the suckers and extract money from the suckers?" Whether it's indirectly through their attention placing ads, or directly through e-commerce. This is a key piece. I mean it's an old adage, but following the money is really important. And the reality is that when it comes to people making money through digital ads, Google has, for a long time, been sort of the main checkbook there.
If you can get your sites into Google's network, you're going to earn more money than if you're on some of the other lower paying junky ones who will have a lower standard than Google. It's not that Google has no standards, what I think we've really shown is that Google operates its ad business and its ad tech at a scale that it is unable to manage, and unable to effectively enforce the policies it says it has. Which again, sounds like a familiar refrain, insert the company and insert the thing, when it comes to big tech. But in this case, it's a very clear problem, because it results in money, it results in funding to some of these worst actors around the world. That's why we wanted to go global and get out of the US mindset, and show that in fragile democracies in the Balkans, some of the absolute worst offender websites, some of the places... In that one case, this genocide denying separatist politician trying to tear the country apart, this TV station connected to his family is making money from Google.
And Google talks about how it creates economic opportunity for people around the world by enabling them to get ads through Google, which is true. If you're a really small publisher and can get into Google's ad network, that's a great opportunity. But if Google is going to operate in these countries, it has to have oversight. And what I think we showed in that disinformation investigation is that Google is operating in these countries, it is a major funder of some of the worst actors, and Google does not have the oversight in the language of the country to actually do the job. And we spoke to a former Google leader with insight in this area who talked about, "Well, imagine you're the country manager for Serbia, how much is it going to cost you to train up a whole team of content reviewers in your language, and how much money is your country generating versus how much would that big oversight operation cost?"
Well, all of a sudden you're going to run your country at a loss. And Google's going to be like, "What the hell's going on there?" And so I think the money piece is really important and it's astounding to me how overlooked the digital ad ecosystem is as the key funder of this. And I just hope... I feel like I've been banging my head against the wall for five or six years on this. I really think other journalists, and just the world in general, needs to pay more attention to ad tech and to the funding of this stuff, and also to the fact that just tens of billions of dollars disappears every year into a black hole, and nobody knows who gets that money. How is that a business? How is that something that is digital advertising? It blows my mind.
Well, I know that you'll continue to be on this beat, and I will point out to my listeners that at the bottom of all your stories, there is a place where individuals who may have information that could be useful to your investigation can get in touch. But at the bottom of the article on gun ads in particular, there is an entire survey where you invite individuals who may have information about Google's ad business and these issues in particular to get in touch and provide encrypted options as well. So if there's anybody listening to this that I suppose wants to give Grandpa Disinfo his next tip, you'll know where to reach out.
It's always appreciated. My Twitter DMs are open and I am on Signal, which you can find from my bio page on ProPublica. Always interested in hearing from people, even if you don't have something you think is super interesting or revelatory, always happy to hear from people who have knowledge and interest in this area. I struggle to break through and get these stories to be readable for the average person, and so if folks want to take time and read them and have thoughts to share, always happy to hear about that as well.
Craig, thank you for speaking to me about all of these articles, and again, thank you for sharing your methods and your tools in the way that you work with this community. Having sat in some of those sessions and having had you obviously speak to my students in the past, I'm very grateful to you for that.
Thank you. Yeah, I don't think I lose anything by sharing new tools and approaches that I come across. And I think there's a great community of journalists and other folks, researchers, who are pushing the field of digital investigations, and it's one that I think is open to everybody. You don't need to be a journalist. There's lots of great people in academia, in think tanks, and just amateur folks noodling around doing interesting investigative work. So I think it's a great team effort, and we always win by sharing what we know, so I'm happy to do it and always happy to learn from others.
Thank you, sir, and happy New Year.
Thank you, same to you.