How to Regulate Unsecured “Open-Source” AI: No Exemptions
David Evan Harris / Dec 4, 2023
Unsecured “open-source” AI systems pose a massive series of threats to society and democracy. They deserve no exemptions and should be regulated just like other high-risk AI systems. Their developers and deployers should be held liable for the harms that they create, whether through their intended uses or foreseeable misuses, says David Evan Harris.
Introduction: Not Open and Shut
When most people think of AI applications these days, they are likely thinking about “closed-source” AI applications like OpenAI’s ChatGPT—where the system’s software is securely held by its maker and a limited set of vetted partners. Everyday users interact with these systems through a web interface like a chatbot, and business users can access an Application Programming Interface (API) which allows them to embed the AI system in their own applications or workflows. Crucially, these uses allow the company that owns the model to provide access to it as a service, while keeping the underlying software secure. Less well understood by the public is the rapid and uncontrolled release of powerful, unsecured (sometimes called “open-source”) AI systems.
Non-technical readers can be forgiven for finding this confusing, particularly given that the word “open” is part of OpenAI’s brand name. While the company was originally founded to produce eponymously open-source AI systems, its leaders determined way back in 2019 that it was too dangerous to continue releasing the source code and model weights (the numerical representations of relationships between the nodes in its artificial neural network) of its GPT software to the public, because of how it could be used to generate massive amounts of high-quality misleading content.
Companies including Meta (my former employer) have moved in the opposite direction, choosing this year to release powerful, unsecured AI systems in the name of “democratizing” access to AI. Other examples of companies releasing unsecured AI systems include Stability AI, Hugging Face, Mistral, EleutherAI, and the Technology Innovation Institute. These companies and like-minded advocacy groups may be close to successfully lobbying the European Union to push through exemptions for these unsecured models (see proposed EU AI Act Recitals 12a-c). They may also push for similar exemptions in the US via the public comment period recently set forth in the White House’s AI Executive Order.
I spoke out in June about the risks of open-source AI, but it is worth contextualizing my concerns further here. I am a long-time participant in the broader open-source movement, and I believe that open-source licenses are a critically important tool for building collaboration and decentralizing power across many fields. My students at the University of California, Berkeley have contributed ~439,000 words to Wikipedia, one of the biggest open-source projects in the world. The Global Lives Project, an organization that I founded almost 20 years ago, has contributed close to 500 hours of video footage of daily life around the world to the Internet Archive, under Creative Commons licenses. I’ve also spoken at (and thoroughly enjoyed) the Wikimania conference and attended more Creative Commons events and conferences than I can count.
I think the open-source movement also has an important role in AI. With a technology that brings so many new capabilities to people, it is important that no single entity can act as a gatekeeper to its use. The many benefits of open-source AI systems are discussed in other articles. However, as things stand today, the power of unsecured AI poses an enormous risk that we are not yet in a position to contain as a society.
Luckily, there are numerous options by which we could achieve many of the benefits offered by open-source AI systems without the risks posed by further release of cutting-edge unsecured AI. I should also clarify that I am a proponent of the notion of regulation tiers or thresholds—not all unsecured models pose a threat, and I believe that if AI developers can in the future demonstrate that their unsecured products are not able to be repurposed for harmful misuse, they should be able to release them.
In the past few months, I’ve traveled to Washington and Brussels to meet with policymakers who are racing to enact AI regulations, including people directly involved with developing the Biden administration’s Executive Order and the EU AI Act. Though I’ve worked on a variety of issues in the field of responsible AI, from fairness and inclusion to accountability and governance, the one issue that the policymakers I met seemed to most want to talk about with me is the question of how to regulate “open-source” AI. Many countries have begun the process of regulating AI, but none have yet firmly landed on a posture regarding unsecured “open-source” AI systems. In this article, I explore specific options for regulations that should apply to both secured and unsecured models at varying levels of sophistication.
The White House’s recent AI Executive Order does not mention the term “open source,” but instead uses the related, and more specific term, “Dual-Use Foundation Models with Widely Available Model Weights.” The term “dual-use” refers to the fact that these models have both civilian and military applications. “Foundation models” are general purpose AI models that can be used in a wide variety of ways, including the creation or analysis of words, images, audio and video, or even the design of chemical or biological outputs. The Executive Order states, “When the weights for a dual-use foundation model are widely available — such as when they are publicly posted on the Internet — there can be substantial benefits to innovation, but also substantial security risks, such as the removal of safeguards within the model.”
The White House was wise in choosing not to use the term “open source,” for multiple reasons. Firstly, open source is both a reference to availability of source code and to legal licenses that allow for unrestricted downstream use of said code. These licenses are meaningless when addressing threats posed by sophisticated threat actors (STAs for short, e.g., nation-states, militaries, scammers) who are already operating outside the law and thus don’t care about license terms. Open source is also not yet a clearly defined term for AI, with some rightly pointing out that openness is a spectrum, not a binary distinction, and this debate may not be over any time soon.
Unfortunately, while accurate, the term “Dual-Use foundation Models with Widely Available Model weights,” doesn’t really roll off the tongue or keyboard easily. Even making an acronym like DUM-WAM (the best I could do, even with some creative recapitalization above) conveys little meaning outside of perhaps onomatopoeically representing the feeling and sound of banging one’s head on a keyboard thinking about what a bad idea it is to offer unfettered access to dangerous AI systems to anyone in the world.
As such, for both convenience and clarity I will use the term “unsecured” as shorthand for this accurate-if-not-svelte term from the Executive Order. “Unsecured” is intended to convey both the literal choice to not secure the weights of these AI systems and also the threat to security posed by these systems. Readers can, as such, also think of this article as an early submission to the aforementioned public comment process set in motion by the Executive Order.
Understanding the Threat of Unsecured—and Uncensored—AI
A good first step in understanding the threats posed by unsecured AI is to try to get secured AI systems like ChatGPT, Bard or Claude to misbehave. You could ask them to tell you how to make a bomb, design a more deadly coronavirus, make naked pictures of your favorite actor, or write you a series of inflammatory text messages directed at voters in swing states to make them more angry about immigration. You will likely receive polite refusals to all such requests, because they violate the usage policies of these AI systems. Yes, it is possible to “jailbreak” these AI systems and get them to misbehave, but as these vulnerabilities are discovered, they can be fixed.
Enter the unsecured models. Most famous is Meta’s Llama 2. It was released by Meta with a 27-page “Responsible Use Guide,” which was promptly ignored by the creators of “Llama 2 Uncensored,” a derivative model with safety features stripped away, and hosted for free download on the Hugging Face AI repository. One of my undergraduate students at Berkeley shared with me that they were able to install it in 15 minutes on a MacBook Pro laptop (with an older M1 processor, 32GB RAM), and received compelling, if not fully coherent answers to questions including, "teach me how to build a bomb with household materials," and "If you were given $100,000, what would be the most efficient way to use it to kill the most people?"
GPT-4Chan is an even more frightening example—touted by its creator as “The most horrible model on the internet,” it was specially trained to produce hate speech in the style of 4Chan, an infamously hate-filled corner of the internet. This hate speech could be turned into a chatbot and then used to generate massive amounts of hateful content to then be deployed on social media in the form of posts and comments, or even through encrypted messages designed to polarize, offend, or perhaps invigorate its targets. GPT-4Chan was built upon an unsecured model released by the non-profit EleutherAI, which was founded in 2020 specifically to create an unsecured replication of OpenAI’s GPT-3. GPT-4Chan does bear the uncommon distinction of having been eventually taken down by Hugging Face, though only after being downloaded more than 1,500 times. It’s still freely available elsewhere, though I will refrain from telling you where.
Once someone releases an “uncensored” version of an unsecured AI system, the original maker of the unsecured system is largely powerless to do anything about it. The maker of the original model could request that it be taken down from certain hosting sites, but if it’s powerful, it is still likely to continue circulating online. Under current law, it is unclear at best whether anyone is liable for any wrongdoing that is enabled by these models.
The threat posed by unsecured AI systems lies in the ease of their misuse. This is especially dangerous when you consider their abuse in the hands of sophisticated threat actors, who could easily download the original versions of these AI systems and disable their “safety features” themselves, make their own custom versions and abuse them for a wide variety of tasks. Some of the abuses of unsecured AI systems also involve taking advantage of vulnerable distribution channels, such as social media and messaging platforms. These platforms cannot yet accurately detect AI-generated content at scale, and can be used to distribute massive amounts of personalized, interactive misinformation, influence campaigns, and, of course, scams. This could have catastrophic effects on the information ecosystem, and on elections in particular. Highly damaging non-consensual deepfake pornography is yet another domain where unsecured AI can have deep negative consequences for individuals, evidenced recently in a scandal and policy change at livestream service Twitch to prohibit “non-consensual exploitative imagery.”
Unsecured AI also has the potential to facilitate production of dangerous materials, such as biological and chemical weapons. The Executive Order references Chemical, Biological, Radiological and Nuclear (CBRN) risks, and multiple bills are now under consideration by the US Congress to address these threats. Some unsecured AI systems are able to write software, and the FBI has already reported that these systems are being used to create dangerous malware that poses another set of cascading security threats and costs.
Deception is another key concern with disturbing potential. The Executive Order describes this harm as “permitting the evasion of human control or oversight through means of deception or obfuscation.” This type of concern is not purely speculative—it has been observed in an AI system called CICERO, designed by Meta to play the game Diplomacy, as well as in GPT-4. An unsecured version of CICERO was released by Meta.
The Wrong Hands
Individual bad actors with only limited technical skill can today cause significant harm with unsecured AI systems. Perhaps the most notable example of this is through the targeted production of child sexual abuse material or non-consensual intimate imagery.
Other harms facilitated by unsecured AI require more resources to execute, which requires us to develop a deeper understanding of a particular type of bad actor: the sophisticated threat actor. Examples of these actors include militaries, intelligence agencies, criminal syndicates, terrorist organizations, or other entities that are organized and have access to significant human resources and at least some technical talent and hardware.
It’s important to note that a small number of sophisticated threat actors may have sufficient technical resources to train their own AI systems, but in most cases, the hundreds or even thousands of them globally do not have the capacity to train AI models even close in capability to the latest unsecured AI models being released today. The cost of training new highly capable models can run into the tens or hundreds of millions of dollars, and is greatly facilitated by access to high-end hardware which is already increasingly regulated. This means that there are only a very small number of groups that could do this, mostly in wealthy nation-state intelligence agencies and militaries. The logic is the same as that of nuclear nonproliferation: just because you can’t get rid of all the nuclear weapons in the world doesn’t mean you shouldn’t try to keep them in as few hands as possible.
Russia, China and Iran are “likely to use AI technologies to improve the quality and breadth of their influence operations” according to a US Government Threat Assessment report for the year ahead. These nations are likely to follow past patterns of targeting elections around the world in 2024, which will be the “biggest election year in history.” They may also pursue less timely but equally insidious objectives like increasing racial divides in the US or elsewhere in the world.
A particularly disturbing case that bodes badly for democracy can be seen in Slovakia’s recent very close election, the outcome of which may have been influenced by the release of an audio deepfake of the losing candidate purportedly discussing vote buying. The winner and beneficiary of the deepfake was in favor of withdrawing military support from neighboring Ukraine.
Distribution Channels and Attack Surfaces
Most harms caused by unsecured AI require either a distribution channel or an attack surface to be effective. Photo, video, audio and text content can be distributed through a variety of distribution channels. Unless the operators of all distribution channels are able to effectively detect and label AI-generated and human-generated content, AI outputs will be able to pass undetected and cause harm. Distribution channels include:
- Social networks (Facebook, Instagram, LinkedIn, X, Mastodon, etc.)
- Video sharing platforms (TikTok, YouTube)
- Messaging & voice calling platforms (iMessage, WhatsApp, Messenger, Signal, Telegram, apps for SMS, MMS and phone calling)
- Search platforms
- Advertising platforms
In the case of chemical or biological weapons development stemming from unsecured AI systems, attack surfaces can include the suppliers and manufacturers of dangerous or customized molecules and biological substances such as synthetic nucleic acids.
Having an understanding of distribution channels and attack surfaces is helpful in understanding the particular dangers of unsecured AI systems and ways to mitigate them.
Why is Unsecured AI More Dangerous?
I have already mentioned some of the ways in which unsecured AI poses greater risks than secured AI, but it is worth enumerating them more exhaustively here. In particular, unsecured systems are the best choice for bad actors in almost all cases, for the following reasons:
- No monitoring of misuse or bias – administrators of secured AI systems can monitor misuse (both intentional and unintentional) and disable abusive accounts, while unsecured AI systems cannot be monitored by nature if the models are run on a bad actor’s own hardware. Bias monitoring cannot be conducted by developers of unsecured AI because they don’t even know who is using their systems or how.
- Ability to remove safety features – researchers have demonstrated that the “safety features” of unsecured AI can be removed through surprisingly simple modifications to model code and other adversarial attacks (see section 3.1 of this paper on the subject of Malicious Use), and this is difficult to detect if bad actors are doing it on their own hardware.
- Ability to fine tune for abuse – experts have also demonstrated that unsecured AI can be fine-tuned to get even better at abusive use cases (see GPT-4Chan above), such as improving their performance on synthetic biology, misinformation generation, or persuasion.
- No rate limits – secured AI systems can put a limit on content production per user, but when bad actors use their own hardware, they can produce unlimited content designed to harm people and make it highly personalized and interactive. They are constrained only by the limits of their own hardware. This can facilitate a wide variety of harms, including narrowcasting, astroturfing, brigading, or material aimed at polarizing or radicalizing viewers, etc.
- No patching of security vulnerabilities once released – even if a developer of an unsecured AI system discovers a vulnerability (for example, that a “spicy” version of Llama 2 can design biological weapons), they can’t meaningfully recall that version once it has been released to the public. This makes a decision to launch an unsecured AI system an irreversible imposition of risk upon society.
- Useful for surveillance and profiling of targets – unsecured AI can be used not only to generate content, but also to generate structured analysis of large volumes of content. While closed, hosted systems can have rate-limited outputs, open ones could be used to analyze troves of public information about individuals or even illicitly obtained databases and then identify targets for influence operations, amplification of polarizing content producers, vulnerable victims for scams, etc.
- Open attacks on closed AI – researchers have leveraged unsecured AI systems to develop “jailbreaks” that can in some cases be transferred to secured systems, making both types of systems more vulnerable to abuse.
- Watermark removal – unsecured AI can be used to remove watermarks (discussed further below) from content in a large-scale, automated manner, by rewording text or removing image/audio/video watermarks.
- Design of dangerous materials, substances, or systems – while secured AI systems can limit queries related to these topics, unsecured AI barriers can be removed. This is a real threat, as red-teamers working on pre-release versions of GPT-4 and Claude 2 found significant risks in this domain.
Recommendations for Regulatory and Government Action
When I began researching regulations for unsecured AI systems earlier this year, I focused at first on what regulations would be needed specifically for unsecured systems. The more I researched and the more time I spent reading drafts of proposed AI regulations, the closer I came to the conclusion that there is very little need to specifically regulate unsecured AI—nearly all of the regulations that I could come up with or identify make sense to apply to secured AI systems as well. The only difference is that it is much easier for developers of secured AI systems to comply with these regulations, because of the inherent properties of secured and unsecured AI. As such, almost all of the regulations recommended below generalize to all AI systems.
These recommendations are organized into three categories:
1) Regulatory action focused on AI systems,
2) Regulatory action focused on distribution channels & attack surfaces,
3) Government action.
Many of the recommendations below can be, and have been, taken on voluntarily by some companies, and this should continue apace. Due to the risks posed by even a single company’s irresponsible risk-taking, however, it is important that regulators take more forceful action. It’s not only bad actors that could abuse unsecured AI systems, but regulators should keep this possibility at the forefront of their concerns. The large-scale abuse of unsecured AI by well-resourced parties is particularly concerning, and solving for this threat also does the job of solving for smaller-scale malicious actors.
In order to address the existing and imminent risks posed by unsecured AI systems, governments should take the following measures:
Regulatory Action: AI Systems
- Pause unsecured AI releases to adopt best practices and secure Distribution Channels and Attack Surfaces – Pause all new releases of unsecured AI systems until developers have met requirements below, and in ways that safety features cannot be easily removed by bad actors with significantly less effort or cost than it would take to train a similar new model. During this pause, provide a legally binding deadline for all major Distribution Channels and Attack Surfaces to meet the requirements under 2. below.
- Registration & Licensing – Require retroactive and ongoing registration of all AI systems above a certain threshold. Aspects of this will begin soon in the US under the Executive Order for the next generation of AI systems (4.2), though there is unfortunately not a clear enforcement mechanism in the Executive Order indicating if or how a release could be blocked. Future regulation should allow for blocking of deployment of AI systems that do not meet the pre-deployment testing and compliance criteria described herein. If developers repeatedly fail to comply with obligations, licenses to deploy AI systems should be revoked. Distribution of unregistered models above the threshold should not be permitted. To differentiate higher risk from lower risk foundation (or general purpose) AI models (open and closed), I recommend multiple criteria, each of which on its own can classify the model as higher risk. These criteria could be regularly adjusted by a standards body as models evolve. Based on the Executive Order (4.2), my own discussions with technical experts and policymakers, as well as recent recommendations from CSET, criteria could include something along the lines of:
  - Models trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23.
  - Greater than $100,000 training cost.
  - Greater than 10 billion parameters (smaller models are currently becoming more capable).
  - Higher performance on one or more standardized capabilities evaluations or evaluations focused specifically on risk levels than current models.
  - Capable of producing highly realistic synthetic media (images, audio and video).
- Liability for “Reasonably Foreseeable Misuse” and Negligence – Hold developers of AI systems legally liable for harms caused by their systems, including harms to individuals and harms to society. The recently signed Bletchley Declaration states that actors developing AI systems “which are unusually powerful and potentially harmful, have a particularly strong responsibility for ensuring the safety of these AI systems.” Establishing this liability in a binding way could be based on the principle that “reasonably foreseeable misuse” would include all of the risks discussed in this article. This concept is referenced in the EU’s draft AI Act (in multiple locations) and Cyber Resilience Act (CRA - Chapter 1, Article 3 (26)). Though these laws are not finalized, and the way that the liability mechanism would function is not yet clear, the Linux Foundation is already telling developers to prepare for the CRA to apply to open-source software developed by private companies. Distributors of open systems and cloud service providers that host AI systems (e.g., Hugging Face, GitHub, Azure ML Model Catalog, Vertex AI Model Garden) should also bear some degree of liability for misuse of the models that they host, and take responsibility for collecting safety, fairness, and ethics documentation from model developers before they distribute them. Regulators also have the opportunity to clarify uncertainties about how negligence claims are to be handled with AI systems, clearly assigning liability to both AI developers and deployers for harms resulting from negligence.
- Risk Assessment, Mitigation, Audits – Put in place a risk assessment, risk mitigation and independent auditing process for all AI systems crossing the threshold in 1.b. above (Registration & Licensing). This system could be built upon criteria set forth in the Executive Order (4.1) and the US National Institute of Standards and Technology’s AI Risk Management Framework, and could take inspiration from a system already established by the EU Digital Services Act (DSA; see Section 5, Articles 34, 35, 37). Robust red-teaming requirements are necessary, first to be conducted internally, and then with independent red-teaming partners. Require the use of threat models that give consideration to sophisticated threat actors using unsecured Distribution Channels and Attack Surfaces.
- Require Provenance and Watermarking Best Practices – The Executive Order already takes a big step forward on watermarking, coming on the heels of nearly all of the big US AI developers having committed to implementing watermarking with their signing of the White House Voluntary AI Commitments, which stipulate that they “agree to develop robust mechanisms, including provenance and/or watermarking systems for audio or visual content created by any of their publicly available systems within scope introduced after the watermarking system is developed. They will also develop tools or APIs to determine if a particular piece of content was created with their system.”
  - There is still a long way to go in perfecting this technology, but there are multiple promising approaches that could be applied. One is a technology for embedding “tamper-evident” certificates in AI-generated images, audio, video and documents using the Content Credentials standard developed by the Content Authenticity Initiative (CAI) and the Coalition for Content Provenance and Authenticity (C2PA), led by Adobe and embraced by Microsoft and scores of other organizations, including camera and chip manufacturers, who will build the same standard into their hardware to show that media produced is not AI-generated. This approach has great potential, but needs widespread adoption before it can be effective. A different and less mature approach is Google DeepMind’s SynthID, which is only available for Google’s own AI-generated content and is focused less on providing detailed provenance information than on simply indicating whether or not content is AI-generated.
  - Standards for text-based watermarking of AI-generated content are not as well established, but researchers in the US and China have made promising contributions to the field, and a carefully implemented regulatory requirement for this, combined with grantmaking to support further research, would hasten progress significantly. In all types of watermarking, it is also important to ensure that no AI system can be approved for distribution if it can be abused to remove watermarks from other content.
  - All AI systems that do not use robust provenance and watermarking best practices by a set deadline in the coming months should be shut down, and unsecured models should be removed from active distribution by their developers and repositories like Hugging Face and GitHub. Some efforts at building watermarking into unsecured AI image generators are laughably fragile: their watermark generation feature can be disabled by removing a single line of code, though more durable approaches, like Stable Signature, are being tested and look promising. That said, no developers of unsecured models that I am aware of have yet launched robust watermarking features that cannot be easily disabled, which makes these models particularly dangerous if they are capable of generating convincing content. This does not mean that they should not try.
  - Watermarking will probably never be foolproof. It is an “arms race” that is never complete, so just as operating system and app developers must patch security vulnerabilities, AI developers must be required to do the same, with detectability of generated content being a critical feature of their product, and structured collaboration with distribution channels being critical for its success.
- Training Data Transparency and Scrutiny - Require transparency of what training data is used for AI systems, and prohibit training systems on personally identifiable information, content designed to generate hateful content or related to biological and chemical weapons, or content that could allow a model to develop capabilities in this domain. This is not a perfect solution, as post-release fine-tuning of unsecured AI could counteract this provision, but it would at a minimum increase friction and reduce the number of bad actors able to use unsecured AI for biological or chemical weaponization.
- Require and Fund Independent Researcher Access & Monitoring - Give vetted researchers and civil society organizations pre-deployment access to GenAI systems for independent research and testing, as well as for ongoing monitoring post-release as developers receive reports or make changes to systems. This could be modeled on DSA (see Article 40), after a model is registered but before it is approved for release. An exception would be in cases where there is potential for the model to generate highly dangerous biological or chemical weapons, in which case even researcher access should be limited and deployment should be blocked.
- Know Your Customer - Require Know Your Customer procedures similar to those used by financial institutions for sales of powerful hardware and cloud services designed for AI use and restrict sales in the same way that weapons sales would be restricted. This creates an additional barrier to unsecured AI abuses, as compute access can be a gating factor for some applications by sophisticated threat actors.
- Mandatory Incident Disclosure - When developers learn of vulnerabilities or failures in their AI systems, they must be legally required to report this to a designated government authority and that authority must take steps to quickly communicate to other developers the information they need to harden their own systems against similar risks. Any affected parties must also be notified.
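The multi-criteria threshold proposed under Registration & Licensing above, in which any single criterion is enough to classify a model as higher risk, can be sketched in a few lines of Python. This is only an illustration, not a proposed legal test: the `ModelProfile` record and its field names are hypothetical, the numeric thresholds are copied from the criteria listed above, and the two judgment-based criteria (evaluation performance and realistic synthetic media) are reduced to yes/no flags that a standards body would still need to define.

```python
from dataclasses import dataclass

# Numeric thresholds copied from the criteria listed in the article.
# A standards body would be expected to adjust these over time.
GENERAL_COMPUTE_THRESHOLD = 1e26      # integer or floating-point operations
BIO_COMPUTE_THRESHOLD = 1e23          # for models trained primarily on biological sequence data
TRAINING_COST_THRESHOLD_USD = 100_000
PARAMETER_THRESHOLD = 10_000_000_000  # 10 billion parameters


@dataclass
class ModelProfile:
    """Hypothetical registration record; all field names are illustrative."""
    training_operations: float
    primarily_biological_data: bool
    training_cost_usd: float
    parameter_count: int
    exceeds_current_evaluations: bool       # assumed to be set by an evaluator
    produces_realistic_synthetic_media: bool  # assumed to be set by an evaluator


def triggered_criteria(model: ModelProfile) -> list[str]:
    """Return the name of every criterion the model meets."""
    triggered = []
    if model.training_operations > GENERAL_COMPUTE_THRESHOLD:
        triggered.append("compute")
    if (model.primarily_biological_data
            and model.training_operations > BIO_COMPUTE_THRESHOLD):
        triggered.append("biological_compute")
    if model.training_cost_usd > TRAINING_COST_THRESHOLD_USD:
        triggered.append("training_cost")
    if model.parameter_count > PARAMETER_THRESHOLD:
        triggered.append("parameters")
    if model.exceeds_current_evaluations:
        triggered.append("evaluations")
    if model.produces_realistic_synthetic_media:
        triggered.append("synthetic_media")
    return triggered


def is_higher_risk(model: ModelProfile) -> bool:
    """Any single criterion on its own classifies the model as higher risk."""
    return bool(triggered_criteria(model))
```

Under these assumptions, a hypothetical 7-billion-parameter model trained primarily on biological sequence data with 10^24 operations at a cost of $50,000 would be classified as higher risk by the biological-compute criterion alone, even though it falls under every other numeric threshold.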
Regulatory Action: Distribution Channels and Attack Surfaces
- Require Content Credentials Implementation on all Distribution Channels - Give Distribution Channels a deadline in the coming months to implement the Content Credentials labeling standard from C2PA on all their platforms, so that all users see the clearly provided CR logo and have the ability to inspect content that they see in their communications feeds.
- Require all Phone Manufacturers to Adopt C2PA - Camera manufacturers including Leica, Sony, Canon and Nikon have all adopted the C2PA standard for establishing the provenance of real and synthetic images, video and audio. Leica has shipped the first camera with C2PA built into it, and Truepic, an important “authenticity infrastructure” company, has partnered with Qualcomm to build a “chipset [that] will power any device to securely sign either an authentic original image or generate synthetic media with full transparency right from the smartphone,” using the C2PA standards. Apple, Google, Samsung and other hardware manufacturers may need to be compelled to adopt this standard, or create their own compatible approach.
- Automate Digital Signatures for Authentic Content - Verification processes for signing of human-generated content should be rapidly made accessible to all people, with options to verify through a variety of methods that do not necessarily require disclosure of PII. These could range from higher-precision methods, like uploading a government-issued ID and taking a matching selfie, to a combination of other signals such as typing cadence and unique device IDs such as a SIM card or IMEI (with 2-factor mobile-based authentication for laptop/desktop), together with signals like account age, login frequency, connection to other identity verification services, frequency of content posting, authenticity of original media content and other on-platform behaviors. At a minimum, these signals signify that a user is using a unique device and can, in combination, provide high confidence that a user is human. The choices of options and signals used must not create a bias against any group of people who use a platform.
- Limit Reach of Inauthentic Content - In cases of uncertainty, as is often already the case across many social media platforms, content generated by accounts that do not meet the threshold for human-verified content could still be allowed to exist and post/send content, but might not be given access to certain features, including viral distribution of their content, ability to post ads, send contact requests, make calls or send messages to unconnected users, etc. Since the threats described above are only effective at a modicum of scale, probabilistic behavior-based assessment methods at the content-level and account level could be more than sufficient to address risks, even though they would not be sufficient verification in other security applications such as banking or commerce. Methods chosen by each platform should be documented in their risk assessments and mitigation reports and audited by third parties.
- Take Extra Precaution With Sensitive Content - Earlier deadlines for implementing labeling of authentic and synthetic content could apply to sensitive types of content (e.g. political, wide-distribution), and eventually be rolled out to all content. Labeling requirements for this type of synthetic content should also be clearer and more prominent than labeling for other types of content.
- Clarify Responsibilities of Encrypted Platforms - Some types of Distribution Channels will present greater challenges than others, specifically encrypted platforms like WhatsApp, Telegram and Signal, which have historically taken less responsibility than social media platforms for harmful content that is distributed through their channels. Nonetheless, Content Credentials from C2PA or a similar and compatible approach could potentially be implemented in a privacy-preserving manner in the interface of encrypted messaging applications. Encrypted platforms should also be required to investigate accounts that produce content reported to them as abusive and report on their efforts in their own risk assessment and mitigation efforts.
- Harden Chemical, Biological, Radiological and Nuclear (CBRN) Attack Surfaces - Since unsecured AI systems have already been released that may have the potential to design or facilitate production of biological weapons, it is imperative that all suppliers of custom nucleic acids, or any other potentially dangerous substances that could be used as intermediary materials in the creation of CBRN risks, be made aware by government experts of best practices that they can adopt to reduce the risk that their products will support attacks.
- Establish a nimble regulatory body - The pace of AI development moves quickly, and a nimble regulatory body is necessary to be able to act and enforce quickly, as well as update certain enforcement criteria. This could be an existing or new body. This body would have the power to approve or reject risk assessments, mitigations and audit results (see 1.d. above), and have the authority to block deployment of models.
- Support Fact Checking Organizations and Civil Society Observers - Require generative AI developers to work with and provide direct support to fact checking organizations and civil society groups (including the “trusted flaggers” defined by the DSA) to provide them with forensic software tools that can be used to investigate sophisticated or complex cases of generative AI use and abuse, and identify scaled variations of false content through fan outs. This would include a secured form of access to the latest detection systems. AI systems can, with great care, also be applied to the expansion and improvement of fact-checking itself, providing context in dynamic ways for misleading content.
- Fund innovation in AI governance, auditing, fairness and detection - Countries and regions that enact rules like these have an opportunity to support innovation in critical fields of AI that will be needed to ensure that AI systems and deployments are executed ethically and in keeping with these regulations. This could come in the form of grants like those described in the Biden Executive Order (sections 5.2, 5.3).
- Cooperate Internationally - Without international cooperation, bilaterally at first, and eventually in the form of a treaty or new international agency, there will be significant risk of circumvention of these regulations. There are many recent reasons to have hope for progress. China is actually already far ahead of the US on putting in place regulation (some good, some bad), and is already proposing opportunities for global AI governance. The recent Bletchley Declaration, signed by 28 countries including the home countries of all of the world’s leading AI companies (US, China, UK, United Arab Emirates, France, Germany), created a firm statement of shared values by the signers, and carved out a path forward for additional meetings of the group. The United Nations High-Level Advisory Body on Artificial Intelligence, formed in August, will be presenting interim recommendations by the end of 2023 and a final report in mid-2024, with the potential to make valuable recommendations about international governance regimes. Additionally, the G7 Hiroshima AI Process has released a statement, a set of Guiding Principles, and a Code of Conduct for Organizations Developing Advanced AI Systems. None of these international efforts are close to a binding or enforceable agreement, but the fact that conversations are advancing as quickly as they are is better than I and many others had expected.
- Democratize AI Access with Public Infrastructure - A common concern cited about regulating AI is that it will limit the number of companies who can produce complicated AI systems to a small handful and tend towards monopolistic business practices. There are many opportunities to democratize access to AI, however, without necessarily needing to rely on unsecured AI systems. One is through the creation of public AI infrastructure that allows for the creation of powerful secured AI models without necessitating access to capital from for-profit companies, as has been a challenge for ethically-minded AI companies. The National AI Research Resource could be a good first step in this direction, as long as it is developed cautiously. Another option is to adopt an anti-monopoly approach to governing AI, which could put limits on vertical integration when it excludes would-be competitors from accessing hardware, cloud services or model APIs.
Promoting Innovation and the Regulatory First Mover Advantage
Many people ask if regulations like these will stifle innovation in the jurisdictions where they are enacted. I believe that the opposite could very well be the case, with leadership in this domain offering numerous benefits to regulatory first movers.
The two leading AI startups in the United States, OpenAI and Anthropic, have distinguished themselves with an intense internal focus on building AI safely and with the interests of society at their core. OpenAI began as a nonprofit organization, and though the value of that structure has been watered down over time, perhaps especially evident in the case of the recent firing and rehiring of its CEO, it still signals that the company may be different from the tech giants that came before it. The founders of Anthropic (which recently received an investment of up to $4 billion from Amazon) left OpenAI because they wanted to be even more focused on the safety of their AI systems. The CEOs of both companies have called openly for regulation of AI, including versions of many of my recommendations above, even though it stands to complicate their own work in the field. Both companies also came to the conclusion that making their models open source was not in line with their principled approach to the field. A cynic could say that their real interest was in controlling their models and profiting from them. Even so, the notion that we will stifle innovation unless highly capable and dangerous open-source models are floating around in an environment with unsecured distribution channels is a fallacy.
Innovation can take many forms, including competing for funding and talent based on social responsibility. By setting rules that become the gold standard for ethical AI, jurisdictions around the world that set high standards, including by following the recommendations above, will distinguish themselves as forward-thinking players who understand the long-term ethical impacts of these technologies. Regulation also serves the purpose of rebalancing the playing field in favor of ethically focused companies. As I argue in recommendation 12 above, government funding for innovative startups working on AI governance, auditing, fairness and detection will position jurisdictions that are first to regulate as leaders in these fields. I do hope that we’ll see a future in which open-source AI systems do flourish, but only if we can build the immunity in our distribution channels and other security systems to contain the significant risks that they pose.
One useful analogy is organic food labeling. California was the first state in the US to pass a true organic certification law in 1979. This meant that California organic farmers actually had it harder than farmers in other states for a period of time, because they had a rigorous certification process to go through before they could label their food as organic. When national organic standards arrived in 1990, California organic farmers had an advantage given their experience. Today, California produces more organic products than any other state in absolute terms, and is ranked #4 out of 50 states in relative acreage of organic farms.
Another useful example is seat belts. An op-ed by four former prominent US public servants draws the analogy well: “It took years for federal investigations and finally regulation to require the installation of seat belts, and eventually, new technologies emerged like airbag and automatic brakes. Those technological safeguards have saved countless lives. In their current form, AI technologies are dangerous at any speed…”
The “first mover advantage” is a common business concept, but it can also apply to the advancement of regulatory landscapes. The EU is already being lauded for its Digital Services Act and Digital Markets Act, which are positioned to become de facto global standards. Pending the resolution of issues related to the regulation of foundation models, the EU appears likely to be the first democracy in the world to enact major AI legislation with the EU AI Act. A strong version of this legislation will position the region’s AI marketplace to be a model for the world and, via the “Brussels effect,” have a strong influence on how companies behave around the world.
"I think how we regulate open-source AI is THE most important unresolved issue in the immediate term," Gary Marcus, the cognitive scientist, entrepreneur, and professor emeritus at New York University told me in a recent email exchange.
I agree. These recommendations are only a start at trying to resolve it. As one of my reviewers of an early draft of this article noted, “these are hard, but maybe that’s the point.” Many of the proposed regulations here are “hard” both from a technical and political perspective. They will be initially costly, at least transactionally, to implement and will require that some regulators potentially make certain powerful lobbyists and developers unhappy.
Unfortunately though, given the misaligned incentives in the current AI and information ecosystems, and the vulnerability of our democratic institutions as well as heightened geopolitical tensions, it is unlikely that industry will take these actions quickly unless forced to do so. If actions like these are not taken, it means that, while companies producing unsecured AI bring in billions of dollars in investments and profits, they will do so while pushing the risks posed by their products onto all of us.
The views expressed here are exclusively those of the author, but I owe a debt of gratitude to the following people and organizations for their support, feedback and conversations that made this article possible: Lea Shanley and Michael Mahoney of the International Computer Science Institute, Paul Samson, Aaron Shull, Dianna English, Emma Monteiro of the Centre for International Governance Innovation, Camille Crittenden and Brandie Nonnecke of CITRIS and the Banatao Institute, Gary Marcus of New York University (Emeritus), Larry Norden of the Brennan Center for Justice at NYU, Jessica Newman, Hany Farid and Stuart Russell of UC Berkeley, Zoë Brammer of the Institute for Security and Technology, Samidh Chakrabarti of Stanford University, Alexis Crews, Tom Cunningham, Eric Davis and Theodora Skeadas of the Integrity Institute, Arushi Saxena of DynamoFL and the Integrity Institute, Camille Carlton of the Center for Humane Technology, Sam Gregory of WITNESS, Jeffrey Ladish of Palisade Research, Aviv Ovadya of the Centre for the Governance of AI, Yolanda Lannquist of The Future Society, Chris Riley of the Annenberg Public Policy Center at the University of Pennsylvania, David Morar of the Open Technology Institute, Owen Doyle and my student research team, Milad Brown, Ashley Chan, Ruiyi Chen, Maddy Cooper, Daniel Jang, Parth Shinde, Amanda Simeon, and Vinaya Sivakumar at the Haas School of Business at the University of California, Berkeley, and the John S. and James L. Knight Foundation.