Anthropic hits back at music publishers in AI copyright lawsuit

Anthropic, a major generative AI startup, laid out its case for why copyright infringement allegations by a group of music publishers and content owners are invalid in a new court filing on Wednesday.

In the fall of 2023, music publishers including Concord, Universal, and ABKCO filed a lawsuit against Anthropic, accusing it of copyright infringement over its chatbot Claude (now superseded by Claude 2).

The complaint, filed in federal court in Tennessee (one of America’s “Music Cities” and home to many labels and musicians), alleges that Anthropic’s business profits from “illegally” scraping song lyrics from the Internet to train its AI models that then reproduce the copyrighted texts for users in the form of chatbot responses.

In response to a request for a preliminary injunction — a measure that, if granted by the court, would force Anthropic to stop offering its Claude AI model — Anthropic made familiar arguments that have appeared in many other copyright disputes over AI training data.

Gen AI companies like OpenAI and Anthropic rely heavily on collecting vast amounts of publicly available data, including copyrighted works, to train their models, but they argue that this use constitutes fair use under the law. The question of whether such data scraping qualifies as fair use is widely expected to eventually reach the Supreme Court.

Song lyrics are just a “miniscule portion” of training data

In its response, Anthropic argued that the use of the plaintiffs’ lyrics to train Claude is “a transformative use” that adds “a further purpose or different character” to the original works.

In support of this, the filing directly quotes Anthropic’s director of research, Jared Kaplan, as saying that the goal is “to create a dataset that teaches a neural network how human language works.”

Anthropic argued that its conduct “did not have a ‘substantial adverse effect’ on a legitimate market” for the plaintiffs’ copyrighted works, noting that song lyrics make up a “miniscule portion” of the training data and that licensing content on the necessary scale would be unworkable.

Joining OpenAI, Anthropic argues that licensing the vast amount of text required to properly train neural networks like Claude is technically and financially unfeasible. Training requires trillions of snippets across genres, a scale that no party could realistically license.

Perhaps the filing’s most novel argument alleges that the plaintiffs themselves, not Anthropic, engaged in the “volitional conduct” required for direct infringement liability with respect to the outputs.

“Volitional conduct” in copyright law refers to the requirement that a party accused of infringement be shown to have control over the infringing output. In this case, Anthropic is essentially saying that the publisher plaintiffs caused its AI model Claude to produce the infringing content, and thus control and bear responsibility for the infringement they report, as opposed to Claude producing the content autonomously in response to user inputs.

The filing points to evidence that the outputs in question were generated by the plaintiffs’ own “attacks” on Claude, prompts designed to provoke it into reproducing lyrics.

Irreparable harm?

In addition to disputing copyright liability, Anthropic argues that the plaintiffs cannot prove irreparable harm.

Citing a lack of evidence that song licensing revenue has declined since Claude’s launch, or that any alleged harms are “certain and immediate,” Anthropic pointed out that the publishers themselves believe monetary damages could make them whole, contradicting their own claims of “irreparable harm” (since, by definition, accepting monetary damages would mean that the harms have a price that can be quantified and paid).

Anthropic argues that the “extraordinary relief” of an injunction against it and its AI models is unwarranted given the plaintiffs’ weak showing of irreparable harm.

The company also argued that the music publishers’ request is overly broad, seeking to restrict the use of not only the 500 representative works in the case, but also millions of others the publishers claim to control.

In addition, the AI startup challenged venue, claiming the case was filed in the wrong jurisdiction. Anthropic says it has no relevant business ties to Tennessee, noting that its headquarters and principal operations are in California.

Additionally, Anthropic said none of the alleged violations cited in the lawsuit, such as training its AI models or serving responses to users, occurred within Tennessee’s borders.

The filing states that users of Anthropic’s products have agreed to submit all disputes to California courts.

The copyright battle in the booming generative AI industry continues to intensify.

More artists have joined lawsuits against image generators such as Midjourney and OpenAI, maker of the DALL-E model, bolstering claims of infringement with evidence of reconstructions produced by diffusion models.

The New York Times recently filed a copyright infringement lawsuit against OpenAI and Microsoft, claiming that their use of scraped Times content to train models for ChatGPT and other AI systems violates copyright. The suit seeks billions in damages and demands that any models or data trained on Times content be destroyed.

Amid these debates, a nonprofit group called Fairly Trained launched this week, advocating a “licensed model” certification for data used to train AI models. Platforms have also stepped in, with Anthropic, Google, and OpenAI, as well as content companies such as Shutterstock and Adobe, promising legal protections for corporate users of AI-generated content.

But creators remain adamant, fighting motions to dismiss claims brought by authors such as Sarah Silverman against OpenAI. Judges will have to weigh technological progress against legal rights in these nuanced disputes.

In addition, regulators are listening to concerns about the scale of data scraping. Court cases and congressional hearings may decide whether fair use shields such appropriation, frustrating some parties while enabling others. Overall, negotiated settlements seem inevitable if all parties are to be satisfied as generative AI matures.

What will come next remains unclear, but this week’s filing suggests that generative AI companies are rallying around a core set of fair use and harm-based defenses, forcing courts to weigh technological progress against the control of rights owners.

As VentureBeat previously reported, no copyright claimant has so far won a preliminary injunction in these types of AI disputes. Anthropic’s arguments are intended to ensure that precedent will continue, at least through this stage in one of the many ongoing legal battles. The endgame remains to be seen.
