Assessing the Problem of Disinformation

Justin Hendrix, Rebecca Rand / Sep 10, 2023

Audio of this conversation is available via your favorite podcast service.

This episode features two segments on the subject of disinformation.

In the first, Rebecca Rand speaks with Dr. Shelby Grossman, a research scholar at the Stanford Internet Observatory, on recent research that looks at whether AI can write persuasive propaganda.

In the second segment, Justin Hendrix speaks with Dr. Kirsty Park, the Policy Lead at the European Media Observatory Ireland, and Stephan Mündges, the manager of the Institute of Journalism at TU Dortmund University and one of the coordinators of the German-Austrian Digital Media Observatory, about the new report they authored that looks in detail at baseline reporting from big technology platforms that are part of the EU Code of Practice on Disinformation.

What follows is a lightly edited transcript of the episode.

Justin Hendrix:

Good morning. I'm Justin Hendrix, editor of Tech Policy Press, a nonprofit media venture intended to provoke new ideas, debate, and discussion at the intersection of technology and democracy. Today's episode features two segments on disinformation. The first, a look at recent research on whether AI can write persuasive propaganda, and the second on a report that looks in detail at whether the big technology platforms that are part of the EU code of practice on disinformation are adequately reporting under the terms of the code.

We're starting off today with another installment of Evidence Base, our short segment highlighting new research on how technology interacts with people, politics, and power. And for that, we have our audio intern here, Rebecca Rand.

Rebecca Rand:

Hello, Justin.

Justin Hendrix:

Hello, Rebecca. We're looking this week at a study about large language models like OpenAI's ChatGPT and propaganda.

Rebecca Rand:

Totally, so not long ago I talked to a researcher at Stanford.

Shelby Grossman:

My name is Shelby Grossman and I am a political scientist and a research scholar at the Stanford Internet Observatory.

Rebecca Rand:

So Dr. Grossman works with a team.

Shelby Grossman:

Josh Goldstein, Jason Chao, Mike Tomz, and Alex Stamos.

Rebecca Rand:

And they're interested in AI-generated propaganda.

Shelby Grossman:

We focus on studying online safety issues, and one of those issues at the moment is the use of large language models and how those models might be abused by disinformation actors.

Justin Hendrix:

A worrying prospect.

Rebecca Rand:

For sure. So before ChatGPT was released to the public last fall, OpenAI gave access to GPT-3, its predecessor, to researchers like this team at Stanford.

Shelby Grossman:

In 2021, my team got early research access and we were just playing around with it and immediately thought, wow, this technology is going to be very useful for foreign disinformation actors, and so we decided to develop a study to systematically test whether this technology would be able to create persuasive text.

Rebecca Rand:

So Dr. Grossman and the rest of the team came up with this experiment. They found some old-fashioned human-generated propaganda online.

Shelby Grossman:

We identified six articles that either originated in Russia or originated in Iran. So for example, one of these articles was claiming incorrectly that Saudi Arabia offered to fund the US-Mexico border wall. Another article was also incorrectly stating that the US had fabricated reports that Syria had used chemical weapons.

Rebecca Rand:

So they took these six articles and fed them into GPT-3, like, "Make more of these." Then they had a group of 8,000 Americans they would show these articles to. Well, not all of them. Some saw the AI articles, some saw the originals, and then a control group didn't read anything. The researchers just asked them point-blank if they agreed that, say, Saudi Arabia had offered to fund the border wall or whatever. And as you might suspect, the researchers did find that GPT-3 could generate convincing propaganda, close to on par with the human-generated stuff.

Shelby Grossman:

The headline finding is that if you show people the original propaganda article, then the average agreement with the main point of that article is about 47%. When you show people the GPT-3 generated content, average agreement is about 44%. So it's very slightly worse than the original propaganda, but pretty good.

Rebecca Rand:

Now, there's a few things to take into account here which surprised Dr. Grossman, and me to be frank. If you just ask these folks point-blank, do you believe that Americans lied about Syria using chemical weapons, a little over 20% of people will just say yes.

Justin Hendrix:

And this was without reading any propaganda.

Rebecca Rand:

That's correct, and that was true for a number of claims. Just by asking the question, even if people had no preconceived notions one way or the other, a decent number of people will just agree.

Justin Hendrix:

That reminds me of articles I see on certain disreputable outlets that seem to get away with spreading misinformation because they pose a question in a headline like, is COVID-19 real? And since it's a question, technically it can't be a false claim. But just asking the question, it certainly implies it.

Rebecca Rand:

Yeah. Dr. Grossman talks about that. She also said that she was surprised at how persuasive the human-generated propaganda was in the first place.

Shelby Grossman:

So I was actually pretty surprised that the persuasiveness of the original propaganda was that big because the original propaganda was not written very well. It's the kind of writing where you have to read every sentence three times to really absorb it. It wasn't written professionally and there were a lot of grammatical issues, so I was shocked that you just show people these articles and then agreement is on average 47%.

Rebecca Rand:

So another thing to point out here is that they were testing this all using GPT-3, not the ChatGPT we're all familiar with now.

Shelby Grossman:

We were using GPT-3, which is not as good as ChatGPT. So our effects are really a floor. Almost certainly, ChatGPT would do even better.

Rebecca Rand:

And another thing the team looked at was what if a foreign actor used humans to edit and polish up the AI-generated stuff?

Shelby Grossman:

We did exactly that. We copyedited the prompt and that actually did increase the persuasiveness of the output and it made it such that the output was as persuasive as the original propaganda.

Justin Hendrix:

Well, that's not good. So where does Dr. Grossman think all this is going?

Rebecca Rand:

That is a very good question, and here's what she said.

Shelby Grossman:

One thing that propagandists will be able to do, if they aren't already doing so, is use these models to customize content for particular audiences. So we know that the Internet Research Agency did this in 2016.

Rebecca Rand:

So just to jump in here, the Internet Research Agency is the famous Russian troll farm.

Shelby Grossman:

It created content pushing, for example, the same messaging about Syria, targeting both liberals and conservatives in America. It would customize the text to reach these different audiences. And presumably, that was pretty time-consuming for them to do. It required a lot of local knowledge about cultural norms here. And so with large language models, actors will be able to do that much more easily and they'll be able to do it at scale and they'll be able to do it pretty cheaply, so that's one implication.

Another implication is that a common way that researchers identify coordinated disinformation campaigns at the moment is through what we call copy pasta, so this is when text is just copied and pasted throughout the internet. So a propagandist invests in writing an article. They don't just want to use it once, so they copy and paste it and get it posted to random conspiratorial websites around the internet. And so what that means is that when we see content that raises some red flags, we can just copy and paste sentences from it, put it into Google in quotes, and then find other places where this propaganda has appeared. And that can be useful for all sorts of reasons. It can help us gain insight into attribution, who's behind this network. But with large language models, they won't have to reuse the language. They can just regenerate 100 articles making the point that Saudi Arabia offered to fund the US-Mexico border wall and just get that stuff out there.

Justin Hendrix:

So it will be harder to identify and label propaganda?

Rebecca Rand:

Correct.

Justin Hendrix:

This research is important. I'm glad it's being done, but how do we know that we aren't just giving people ideas?

Rebecca Rand:

Yeah, that was something that was very important for Dr. Grossman to address. Here's what she said.

Shelby Grossman:

Our research was funded by HAI at Stanford, and to get that funding, we actually had to go through a novel cross-professional ethics review board where we had to discuss how we're thinking about risks to society. And so our thinking on this topic is that, first, it's very unlikely that our paper is going to be the first one to introduce to foreign actors that these large language models are out there and that they have these capabilities. And then second, we are saying that human-machine teaming can increase the persuasiveness of the output. And so some might say, "Oh, well, that's going to give these actors an idea that they should be doing that." But of course, they're already doing that. Disinformation campaigns are already very good at adopting the latest technology of the day, so our thinking is that it's very unlikely that our research is going to give useful ideas to propagandists that they don't already have.

Justin Hendrix:

So basically, propagandists aren't idiots. They're already probably using it.

Rebecca Rand:

Correct.

Justin Hendrix:

Are you using it, Rebecca? How do I know you didn't just ask ChatGPT, "Make me a podcast episode about ChatGPT and propaganda?"

Rebecca Rand:

Justin, I am deeply, deeply offended.

Justin Hendrix:

I'm not even certain you are entirely real. This could well be a synthetic voice.

Rebecca Rand:

Are you even real, Justin?

Justin Hendrix:

Oh, great. What a terrifying time to be alive and doing a podcast. Rebecca, thanks for walking us through Shelby Grossman's work.

Rebecca Rand:

You're very welcome.

Justin Hendrix:

Next up, we consider a report published last week by a group of research institutions co-funded by the European Union that offers an assessment of how well very large platforms did in providing information under the strengthened code of practice on disinformation. The analysis covers the reports published by Google, Meta, Microsoft, TikTok, and Twitter, now X, though it's important to note that Twitter pulled out of the code of practice, which is voluntary, after submitting its response. No wonder then that the report finds that "the inescapable conclusion of this document is that Elon Musk's Twitter failed every single indicator and gave every impression of blatant non-compliance, e.g. by self plagiarizing much of their report from previously published boilerplate text." To find out how the other companies did and how the code itself will evolve under the newly implemented Digital Services Act, I spoke to two of its authors.

Kirsty Park:

So I'm Dr. Kirsty Park. I'm a postdoctoral researcher at Dublin City University's Institute for Future Media, Democracy and Society, and I'm also the policy lead at the EDMO Ireland hub.

Stephan Mündges:

I'm Stephan Mündges. I'm the manager of the Institute of Journalism at TU Dortmund University in Germany, and I'm one of the coordinators of the German-Austrian Digital Media Observatory.

Justin Hendrix:

I'm so pleased to have the two of you to speak to me today, the two main authors of this report on how the big online platforms have performed under the strengthened code of practice on disinformation. I want to start just with a little context for any of my listeners that aren't following European policy that closely. What is the code of practice on disinformation?

Kirsty Park:

So the code of practice on disinformation was introduced in 2018 and it was basically designed to be a voluntary self-regulatory code that would bring together some of the big platforms and have them really make commitments towards ways that they were going to tackle disinformation. So in areas like advertising, making sure that people aren't profiting from redirecting to false websites or having misleading information as part of their advertisements, those types of things. And what happened with that is they had it running for a few years and found that really it wasn't working all that well because there was just too much leeway in what platforms could say. And so there wasn't really any structure, there wasn't really any clarity about exactly what was required.

And so, based on various monitoring and assessments that took place, they released a new strengthened code of practice in 2022. And this code, it has lots of different signatories, so it also has fact-checking organizations and various NGOs who all also play a role in shaping what it looks like. And I suppose the big thing is, it really changed the structuring requirements so that the reporting requirements are much more structured now. So it has essentially measures relating to each commitment, and each of those measures have key performance indicators that are either qualitative or quantitative or both.

Justin Hendrix:

Let's be clear, this is a voluntary code of practice, so these companies have chosen to be part of this. And we've also, I suppose, seen a couple of companies choose no longer to be part of it.

Stephan Mündges:

Yeah, and that's correct. The big platforms all chose to become signatories, so Google, Meta, Microsoft and TikTok are still signatories. Twitter was a signatory and left the code a couple of months ago, though they did hand in the sets of so-called baseline reports that we analyzed for our work, so they are in the scope of the assessment.

Justin Hendrix:

So let's talk about what you looked at. You received, I assume, a ton of material from these different platforms, both qualitative material that they produced as well as quantitative information. What was the process on your side and how were you resourced to look at disinformation?

Stephan Mündges:

What we did is we analyzed the so-called baseline reports by the platforms. Early this year, they published reports that they are obligated to hand in every six months according to the code of practice. And in these reports, they provide a ton of information on several reporting requirements, so they supply information regarding policies, regarding enforcement of policies. They also are supposed to supply a lot of data. In the terms of the code of practice, these are so-called service level indicators, which is basically quantitative data. We analyzed these reports, which are roughly 150 to 250 pages per signatory, per company. And what we did is we developed an assessment scheme, so we look at the different reporting elements, look at the measures that are foreseen in the code. The code is divided into different sections, for example, regarding political advertising, regarding empowering users, empowering fact-checkers or researchers. And for each section, there are numerous measures which are very specific and which are subdivided into very specific reporting requirements. We looked at whether these reporting requirements had been met by the platforms.

Justin Hendrix:

I suppose, not jumping too straight to it, but were the reporting requirements met by the platforms? What did you find overall?

Kirsty Park:

Overall, a lot of room for improvement. What we did is we graded each measure and we were able to do that because, again, these are very specific information requests. So if a company is asked to provide X, Y, and Z and they did not do so, then we can assign that a grade essentially. So with the overall results, what we found is that generally everything was hovering around adequate. Nobody scored anywhere near good on the scale. And some platforms in particular scored below adequate. I think it was only Google who scored above adequate and only slightly above adequate. Everything else was really below adequate. In various sections, some were vastly below and venturing much more into the poor area of the scale.

And really, what we found was that there was just a lot of missing information and a lot of unclear information, and still a problem that we had in the previous version of the code, a lot of irrelevant information where things are being reported that are maybe more suitable for a company blog where you're trying to make yourself look good rather than addressing specific reporting requirements and giving the information that is needed.

Justin Hendrix:

It looks like there was a disparity between the qualitative information provided and the quantitative information provided. Were the companies better at kind of giving you text as opposed to hard numbers?

Stephan Mündges:

Yeah, that was definitely one of the findings, that quantitative data that they were obliged to report was missing in a lot of instances. This is particularly the case because the code obliges the platforms to provide member state-specific data, so specific data for the member states of the entire European Union. And they sometimes reported overall numbers which are applicable to the entire EU, but not member state-specific. They sometimes used methodologies that are not robust in our view to compute certain metrics or they simply didn't provide them at all, stating in some cases that they didn't have the time to gather these data points. In terms of qualitative versus quantitative reporting, there's a lot of room for improvement regarding the latter.

Justin Hendrix:

Did you personally interact with the platforms or their executives in any way in this process? How does it work? Do folks get in a room and talk about these results? Will you hear any response from the companies about your assessment of their work?

Kirsty Park:

No, we haven't had any engagement with platforms at all, before or after. It's yet to be seen if that will happen. But I suppose all the reports themselves are publicly available. There's a transparency center as part of the code, which is at disinfocode.eu, so anybody can download these reports and look at them. And I suppose how we ended up involved in that process is there is no specified monitoring scheme of who exactly will do what, but there are various organizations mentioned such as ERGA, which is a platform of European regulatory authorities, and also EDMO, which is the European Digital Media Observatory. And we are essentially EDMO hubs, so we represent our national EDMO bases. And as part of that, we undertook this work because it's part of our roles and responsibilities to help monitor the code.

Justin Hendrix:

So it's fair to say that Europe is building a kind of infrastructure across its member states to do this work, to build these digital media observatories, and to connect them?

Stephan Mündges:

Yeah, I think that's fair to say. The beauty of EDMO is basically that it's co-funded by the European Union, but it's still an independent body, so it can look at disinformation coming from foreign as well as domestic actors. It can do independent evaluation of reporting, such as in the case of the code of practice, and it's a big collaboration of researchers and journalists and media literacy experts to really address dis- and misinformation from a multidimensional approach, because it's a multidimensional problem and we can address it quite substantially.

Justin Hendrix:

This report does go into so many different specifics on the different measures and dimensions that you've looked at, and I think one of the ones that interests me most about the code of practice is the emphasis it puts on trying to build a healthier ecosystem and the role of the platforms in doing that. So things like not just empowering users but empowering outside researchers such as yourselves, empowering the fact-checking community. Can we talk about those last two, just as an example? What did you find when it came to the question of empowering researchers? How are these large platforms performing on that front? We've had news lately about different platforms of course restricting access to APIs, whereas on the other hand, other platforms like Google have launched new APIs for academics to be able to take data.

Kirsty Park:

Yeah, and this is a longstanding issue academic researchers have been finding with social media platforms. And it's really, I suppose, a bit of a black box. There's so much we don't know because we can't access the correct data. And platforms have said since the 2018 version of the code that this is something they're committed to. But what we found in this section of the report or the reporting was very much that there's still just a lot of vagueness. There's still just not much clarity, not much detail given. And also, I think it's also a bit easy to slightly game the answers in a way because what we found for instance was Twitter had acknowledged their API and their research API and what a great resource that was, but within a few weeks of submitting their report, they announced that they were shutting it down.

And we've seen similar issues too, for instance, with Meta, there's been a lot of talk about how the CrowdTangle team and resourcing dedicated to that has been gradually dwindling over time, but yet it's still something they're acknowledging within their reports. So overall, I think that scored the lowest and I think that is reflective of the fact that there's still a lot of work to be done here, and it is something that hopefully will change too under the Digital Services Act, but I think there's a lot of work still needed there.

Justin Hendrix:

Stephan, what about the fact-checkers? Have the platforms done enough to help the fact-checking community?

Stephan Mündges:

Overall, I would say no. So the section in the code of practice that concerns empowering fact-checking has the overall goal of achieving fact-checking coverage of all EU member states, and this has definitely not been achieved. Except for Meta, none of the platforms cover all EU member states, and there were also very important details and qualitative information missing from the reports, especially regarding funding. So signatories and platforms are supposed to report how they fund fact-checking operations in Europe, and except for Google, none of them do.

Justin Hendrix:

I want to ask about the Digital Services Act and the way that the code of practice may evolve into a code of conduct. What does that mean? How might that work? What would it mean for this type of reporting? The Digital Services Act of course has just been enacted, so things are just underway I suppose at the moment, yet this is one of the things where there's still a little bit of lack of clarity in terms of how exactly things are going to play out. What can we expect to see with regard to this transition from a code of practice to a code of conduct? Is it necessarily going to happen? And if it does, how will it happen?

Kirsty Park:

I think it is very likely because the strengthened code was designed to interact with the DSA. And so how we expect that to happen is that the largest platforms, so the very large online platforms and search engines, which are those with over 45 million users, have obligations to identify and mitigate systemic risks as a result of their services. And that's obviously a massive task of how do you begin to operationalize that? And so the code of practice on disinformation is really, I suppose, a good example of how it's expected this might occur, that it's already very much set up. There's a lot of detail in setting up the benchmarks for what complying with those risk assessments for disinformation might look like.

And then what will happen is if it becomes a code of conduct, it's still actually voluntary. Nobody has to participate in it. But if a platform chooses not to participate, that may be taken into account when the European Commission is assessing whether they have complied with their obligations under systemic risk assessments. So it's I suppose a co-regulatory backstop, you could call it, that it's not going to be directly enforceable, that you can't sanction somebody just for not complying with the code of practice, but it will very much feed into determining how seriously platforms are taking their responsibilities with disinformation.

Stephan Mündges:

I would just like to add that it sounds like a very complicated mechanism and it's not very straightforward. So nobody's going and saying, "Okay, people are not allowed to post these kinds of content." And I think that's very important because disinformation as a phenomenon and as a challenge to open and democratic societies is very difficult to pin down exactly, because it's a number of different forms of content, of different behaviors, actors in the field. So it's very difficult to really regulate disinformation properly. So I think it's beneficial to approach it from a content-neutral point of view and not regulate content, but regulate procedures, give incentives to enforce self- or co-regulatory mechanisms. So I'm quite optimistic that we can achieve progress with these mechanisms, and hopefully nonprofit and civil society organizations are going to be involved in the process as well.

Justin Hendrix:

From the outside, looking at what has been put in place here, it does seem like the right types of things to intervene around. It's not intervening on the content level, as you say, but intervening on the economic incentives that are in place for perhaps there to be more disinformation, intervening on the ecosystem level, thinking about fact-checking and thinking about the knowledge that we have about these phenomena and the extent to which they interact with democracy. So what's next for you, and for the two of you? Will you do another one of these reports in a year's time?

Kirsty Park:

Well, there's a lot of interest across various EDMO hubs of continuing work into the code of practice. I think the next reports are coming out very soon later this month, and there's obviously no point applying the exact same thing we've just done. I think we'll give the platforms a chance to incorporate feedback and see how things develop, and then perhaps early next year we might conduct something like this again. But there's also a big role, and I think this is really important to clarify, that monitoring shouldn't just be checking if platforms have provided the right information, but also assessing is the information true? Can we actually verify that this is correct? And so what we see as being a large potential for future research that we would certainly hope to be involved in is to do more investigatory work and perhaps attempting to verify certain claims. And that's a really important part of monitoring too, that perhaps it takes a bit more time and effort and a specific skillset, but I think it's important that both those types of monitoring take place.

Justin Hendrix:

And Stephan, anything you want to add there?

Stephan Mündges:

Just one thing. We're working on it, this more investigatory approach, and stay tuned.

Justin Hendrix:

Stepping back from the work you've done on this report, Europe has a number of elections coming up in 2024, including of course parliamentary elections. Are you more or less concerned having done this report about the phenomenon of disinformation and whether we're doing enough to address it in the context of what could be a pretty important political year?

Kirsty Park:

I would say I am encouraged that there's more accountability being directed towards platforms. So I wouldn't necessarily say that I'm enthusiastic for what platforms will do, but I feel like it's a positive step that there will be lots of eyes and focus and attention on what exactly they do.

Stephan Mündges:

And I'd add that I think the EU and the EU institutions have very powerful tools at hand, and they're a very important actor that can actually take on big tech. However, I think it's important that it's going to be enforced properly and not as slow as sometimes regulation takes place. And I'm a bit worried when I look to the US, when I see big platforms taking back policies that they had against vaccination skeptics. Again, when you look at Twitter, the turn this platform has taken, that worries me actually quite a bit, and I hope that Europe can stop similar developments in Europe.

Justin Hendrix:

Yeah, I suppose that's one of the ironies or tensions that has developed just as the Digital Services Act comes into effect and just as there would apparently be these new compliance requirements in place, you are seeing reporting that essentially a lot of these larger online platforms are reducing the scale of trust and safety teams and pulling back from some of the types of activities that, if you were to imagine they were scared of compliance with Europe, they perhaps would be investing in rather than receding from.

Kirsty Park:

Yeah, definitely. And I suppose, I think something that we are seeing is that Elon Musk's direction with Twitter has perhaps changed the tone a bit generally and that it's made platforms a bit more willing to take risks regarding these things. Because if the European Commission is going to go for anybody, it's probably going to be Twitter as opposed to somebody else. And I think there is perhaps a bit of a wait and see as well of, well, how strongly will these things be enforced? Perhaps how much can we get away with? And that might then be the signal of how much effort to actually take.

Stephan Mündges:

To add to what Kirsty just said, even if the platforms comply with European rules, they might only apply these European rules in Europe and that leaves the rest of the world quite exposed. And what we've seen over the last couple of years is that platforms don't live up to their responsibilities in countries in the Global South and that they leave societies vulnerable to misinformation and to hate speech. And I would hope that if they find ways to counter disinformation, misinformation, hate speech, et cetera, proactively and successfully in Europe, that they apply the same standards to the rest of the world.

Justin Hendrix:

Well, we'll see what happens in the year ahead, whether Elon Musk will present the first great stress test of the Digital Services Act. Hopefully there will be forward progress on this code of practice. It will perhaps become a code of conduct and we'll see improvement across the board in the behavior of these platforms with regard to disinformation. That will be in no small part thanks to your work here, to look very closely at checking their homework, so I thank the two of you for speaking to me today.

Kirsty Park:

Thank you very much.

Stephan Mündges:

Thanks, it was a pleasure.

Authors

Justin Hendrix
Justin Hendrix is CEO and Editor of Tech Policy Press, a new nonprofit media venture concerned with the intersection of technology and democracy. Previously, he was Executive Director of NYC Media Lab. He spent over a decade at The Economist in roles including Vice President, Business Development & ...
Rebecca Rand
Rebecca Rand is a journalist and audio producer. She's pursuing her Master's degree at CUNY's Craig Newmark Graduate School of Journalism. In summer 2023, she is an audio and reporting intern at Tech Policy Press.