Copyright Meets the Colossus: What the Anthropic Settlement Means for the Future of AI

Bryce Peters — The recent settlement of the Bartz v. Anthropic case marks one of the first major legal inflection points in the AI era. The outcome forces AI companies, authors, and the business community to face an increasingly unavoidable question: If large language models depend on copyrighted works to function, who gets paid, who gets protected, and who bears the risk when the training data is unlawfully sourced?

The lawsuit, filed in August 2024 by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, alleged that Anthropic had built its Claude model in part by uploading millions of books pulled from “shadow libraries” (online databases where copyrighted materials can be illegally accessed and downloaded for free) such as Library Genesis and Pirate Library Mirror. The case, part of a larger ongoing wave of copyright litigation against tech companies over the data used to train artificial intelligence systems, claimed that Anthropic’s use of the authors’ works to train its large language models (LLMs) without permission violated copyright law.

Judge William Alsup’s decision in June 2025 drew an important line. He held that Anthropic’s training of an LLM on lawfully purchased books constituted transformative fair use because the company employed those works for a fundamentally different, non-expressive purpose. However, the court refused to extend that protection to pirated books, emphasizing that fair use does not authorize a company to maintain or exploit a library of illegally obtained works. This distinction exposed Anthropic to potential statutory damages of several billion dollars, enough to cripple the company or put it out of business altogether.

By September 2025, Anthropic chose to settle. The agreement, valued at roughly $1.5 billion for about 500,000 works, amounts to approximately $3,000 of compensation per book, making it one of the largest copyright-infringement settlements in U.S. history. The settlement applies only to past conduct: it does not grant Anthropic any continuing license to use the authors’ works going forward, nor does it shield future training efforts from litigation. What it does provide, however, is a newly visible price tag for the risk of ingesting unlicensed copyrighted material during model development.

The settlement also poses a deeper policy question: Is it sustainable, or even fair, for AI companies to build gargantuan models on the backs of copyrighted materials with little to no compensation, so long as those works are lawfully acquired? While the fair-use holding affirms that lawful acquisition generally protects training on purchased books, the business implications are far more complex. Even if training on legitimately purchased texts qualifies as transformative use, nothing in the decision requires authors to accept going uncompensated for the value their works contribute to the AI economy.

Hence the rise of the licensing model. Allowing authors to license their works to LLM companies aligns the interests of both parties: authors get paid and retain control over their intellectual property, while AI developers gain legal certainty, reputational safety, and access to high-quality data. The Bartz settlement suggests that authors possess real leverage, and that companies are willing to pay substantial sums when that leverage is exercised. For businesses, this points to an emerging market for training-data licenses that could come to resemble the licensing regimes of music or film streaming.

At the same time, a licensing-first approach carries its share of tradeoffs. If training data becomes expensive or overly gated, smaller AI startups could face insurmountable barriers to entry, stifling competition and entrenching the largest firms. Developers could respond by gravitating toward public-domain or open-access materials, reducing the diversity of training data and ultimately degrading model performance. Furthermore, because training relies on ingesting millions of works, determining a “fair price” for each individual contribution is a deeply non-obvious economic question. The Anthropic settlement provides a single data point, not a market structure.

Nevertheless, the broader trend is unmistakable: The era of AI companies comfortably scraping whatever they can find, legal or otherwise, is rapidly closing. The Bartz settlement underscores that provenance matters, that authors are not passive stakeholders, and that fair use is not a conclusive defense. From a risk-management perspective, companies developing or deploying generative-AI systems will need to audit their data sources, scrutinize the legality of their acquisitions, and, where uncertainty exists, seek licenses rather than rely on legal ambiguity.

For authors and creatives, the takeaway is equally clear: Their works are increasingly treated as valuable computational inputs, not simply expressive outputs. That shift gives them a meaningful seat at the table through negotiation, collective action, or regulation. For legal practitioners advising businesses in this space, the settlement signals the beginning of a more mature, contract-driven ecosystem for training data, one that prioritizes transparency, certainty, and rights clearance.

In many ways, the Bartz case is less a conclusion than the opening chapter of a broader restructuring of the relationship between creative labor and artificial intelligence. The settlement affirms that innovation can coexist with copyright protection, but only if companies are willing to account for the sources of the data powering their models. Whether the future leans more toward fair use or licensing will depend largely on how businesses adapt in the settlement’s wake. Yet the message is clear: Creatives will no longer tolerate being intellectually and economically ravaged by the colossus of big tech, and they are willing to leverage their legal rights to ensure they are compensated.