Anthropic Code Crisis Creates Copyright Contradiction
James Ball / Apr 14, 2026
James Ball is a fellow at Tech Policy Press.

The logo of Anthropic is displayed on a smartphone screen in front of another of the company's logos in Creteil, France, on March 27, 2026. (Photo by Samuel Boivin/NurPhoto via AP)
This can’t have been the start to April that Anthropic was hoping for. After a very public feud with the Pentagon that commanded headlines throughout the start of the year, its staff have had to expend huge effort trying to minimize the damage caused by what the company says was “human error” late in March—a leak of some of its source code.
The code in question was what is known as a "harness": code that connects a foundational AI model to the application layer, handling matters like the user interface, connections to data sources and tools, and how the AI processes user prompts.
The harness is fundamental for addressing issues with the AI’s context window, safety and quality concerns, and for managing how an AI allocates resources to tackle a query. Anthropic’s harness is a vital component of its coding model, marketed prominently by the company as “the world’s best coding model.”
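As a rough illustration (this is a hypothetical sketch, not Anthropic's actual code, which remains proprietary), a minimal harness might assemble a system prompt, trim conversation history to fit the model's context window, and mediate calls to the underlying model:

```python
# Hypothetical sketch of an AI "harness": the layer between a foundational
# model and the application, managing context-window limits and prompt
# assembly. Illustrative only; names and limits are invented.

MAX_CONTEXT_TOKENS = 50  # toy limit for demonstration


def fake_model(prompt: str) -> str:
    """Stand-in for a call to a real foundation-model API."""
    return f"[model reply to {len(prompt.split())} tokens of context]"


class Harness:
    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.history: list[str] = []

    def _token_count(self, turns: list[str]) -> int:
        # Crude whitespace tokenizer; a real harness would use the
        # model's own tokenizer.
        return sum(len(t.split()) for t in [self.system_prompt] + turns)

    def _trim_context(self) -> list[str]:
        # Drop the oldest turns until the prompt fits the context window.
        turns = list(self.history)
        while turns and self._token_count(turns) > MAX_CONTEXT_TOKENS:
            turns.pop(0)
        return turns

    def ask(self, user_msg: str) -> str:
        self.history.append(f"User: {user_msg}")
        prompt = "\n".join([self.system_prompt] + self._trim_context())
        reply = fake_model(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply
```

A production harness would add far more, such as tool dispatch, safety filtering, and resource budgeting, but the sketch shows why this layer is valuable engineering in its own right, separate from the model weights.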
The exposed code did not contain user data, a breach of which could potentially have triggered regulatory investigations, nor the actual ‘weights’—the parameters inside a foundational AI model that determine its responses—but was arguably among the most sensitive material the company could otherwise have accidentally released.
Rival companies could, in theory, use the leaked code to assess it against their own codebases, potentially gaining a considerable advantage in closing the gap between their models' coding performance and that of Anthropic's Claude, or even overtaking it.
Understandably, Anthropic has sought to minimize the impact of the unintended disclosure, and turned to US copyright law to do so—issuing a takedown notice under the Digital Millennium Copyright Act (DMCA) to GitHub in a bid to have its leaked code taken offline, or at least to restrict its circulation somewhat.
The company’s critics, though, have been swift to spot the irony: Anthropic, like most of the broader AI industry, has been embroiled in copyright lawsuits for several years—which have seen the company’s lawyers make some expansive claims about why copyright does not prevent the training or development of AI models.
For publishers, authors, and other creators, the schadenfreude might feel well-earned—but is it justified?
Bartz v. Anthropic
There is no intrinsic irony or hypocrisy in a company arguing that copyright law has not been breached in one context, but has in another: disputes of this nature are relatively common between music producers, artists, and companies, but few suggest defending one such suit would make them morally ineligible to pursue another.
Anthropic, though, has made some very specific claims about the role of pirated copyright material available on the internet being used to train and improve AI models, and it did so during the early arguments in the case Bartz v. Anthropic, a class action brought by authors disputing the use of their books to train Anthropic’s AI models.
The authors initially sought relief on two grounds. Firstly, they argued that it was not "fair use" under US copyright law for Anthropic to buy copies of their books and then use those legally purchased books to train AI models. Secondly, they argued that Anthropic had downloaded "hundreds of thousands" of pirated books, a fact the company did not dispute in court, and used these to create an internal library, which in turn was used to help develop its AI models.
In a summary judgment in the Northern District of California, Judge William Alsup largely sided with Anthropic on the matter of fair use—saying that even if Anthropic made digital copies of books it bought for its own use, going on to use those for training was transformative, and rejecting the idea authors could use copyright to restrict that.
“Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts,” Alsup wrote. “[T]o make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.”
This aspect of the summary judgment was a sizable win for Anthropic. The company argued that the pirated books it had downloaded should be treated in much the same way: if the eventual use to which they would have been put was transformative, and thus fair use, the fact they were pirated in the first place should be disregarded.
Judge Alsup disagreed. “Anthropic’s sweeping rule of law as proposed would come down to this,” he wrote. “An AI company is free to pirate copyrighted materials without ever accounting for the extent to which the pirated materials were ever actually and solely used for a fair use.”
He instead ruled that the matter should be decided by a jury at a full trial. “For all we know at this stage, Anthropic will persuade the jury to find facts vindicating it completely,” he concluded. “But if Anthropic loses big it will be because what it did wrong was also big.”
In practice, the matter never came before a jury, as Anthropic reached a settlement with the plaintiffs, expected to cost the company around $1.5 billion and entailing a payout of around $3,000 per copyrighted work affected. (Disclosure: the author of this article, who was not a party to the case, has two books that fall within the scope of the settlement, and so stands to benefit financially from it.)
Prior to the settlement, which did not involve an admission of wrongdoing from Anthropic, its lawyers had argued that whether material useful to the development of its AI models had been pirated should be immaterial if they were then put to transformative use.
This does at least somewhat contradict its use of the DMCA to restrict the spread of its harness code, especially if Anthropic goes on to take action against any company attempting to use the leaked material to improve its own models. When it comes to code, copyright protects only the code as written: if someone else restructures it, in the same way a writer might reword a paragraph, copyright generally ceases to apply.
Looking at someone else’s code, figuring out how it works, and writing something similar yourself is generally not a breach of copyright. Protecting the actual function of a particular bit of code requires patent protection, which requires a much longer and more grueling process to obtain.
Anthropic’s critics, then, are not simply engaging in a cheap shot when they note Anthropic’s stance on copyright seems to have changed. The company has, by necessity, been forced to adopt close to the opposite position as its lawyers argued during Bartz v. Anthropic.
An additional wrinkle
Copyright nerds might be interested in an additional complication around Anthropic’s attempts to use copyright to protect its code: Anthropic has repeatedly and openly admitted it has used Claude Code to write code for Claude itself. This raises a question of whether such code can attract copyright at all, since copyright requires a human to be sufficiently involved in the creative process so as to be regarded as a work’s author.
The difficulty is that the law has not settled where the threshold lies for how much human input is sufficient to qualify for authorship, especially when it comes to human-AI collaborations. Such cases are currently working their way through the court system, though the Supreme Court recently denied cert in one.
Historically, such cases often centered on photography. Burrow-Giles Lithographic Co. v. Sarony (1884) was a seminal case in this area, featuring a dispute between a photographer and a printing company over whether an artistic photographic portrait of Oscar Wilde could be copyrighted.
The printing company, which had distributed thousands of copies of the photo with no payment to the photographer, argued it could not, as the machine had done the work of creating the image. The court found that the composition and styling of the photograph were enough to qualify it as an authored work. More than a century later, this is the precedent several pending AI authorship cases are citing to claim authorship for AI-created works.
Thaler v. Perlmutter, the case the Supreme Court declined to hear last month, involved an attempt by computer scientist Dr. Stephen Thaler to register as copyrighted a work produced by AI, citing the AI itself as the author. This was rejected by the copyright office as the AI is not human, which was upheld by the court.
Dr. Thaler then tried to expand his claim with the copyright office to suggest he was the copyright owner through work-for-hire, but this failed again as the AI did not qualify as a human author that could be hired to produce a copyrighted work. He also introduced the argument that because he had “provided instructions and directed his AI,” he should be credited as the author—but this failed in both the district and circuit courts because he had not advanced the argument with the copyright office in the first instance.
Both courts rejected the AI authorship argument, but the question of when human use of AI qualifies as authorship, and what precedent will settle it, remains open.
One open case, Allen v. Perlmutter—in which the copyright office denied authorship to an AI-generated work produced after hundreds of individual prompts by the user—seems a likely candidate in which courts will finally have to tackle that matter.
In theory, Anthropic’s leaked code, produced in large part by Claude, could itself be a candidate to set such a precedent, though in practice Anthropic may decide it is not in its interests to aggressively pursue AI-related copyright cases through the courts.
The ongoing disputes over AI and creativity now entering the courts are prompting inevitable calls for Congress to reconsider copyright and IP law more broadly, given that the basic framework of US copyright law has remained largely unchanged since 1976.
The DMCA produced a mechanism for accelerated dispute resolution in the internet era, but does nothing to tackle the challenges raised in the AI era. Reformers suggest a major overhaul is long overdue. Perhaps thanks to its recent experiences, Anthropic might even find itself in agreement with them—at least to an extent.