The Quest for Profit

Meta Sued by Major Publishers Over AI Training Data

May 6, 2026 InTech

Five major publishing houses and author Scott Turow filed a class-action lawsuit against Meta and CEO Mark Zuckerberg, accusing the company of illegally using millions of copyrighted books and articles to train its artificial intelligence system, Llama.

The lawsuit was filed in federal court in Manhattan and includes publishers Elsevier, Cengage, Hachette Book Group, Macmillan, and McGraw Hill. The plaintiffs allege that Meta reproduced and distributed copyrighted works without permission or compensation and did so knowing it violated copyright law. The complaint claims Zuckerberg personally authorized and encouraged the practice as part of Meta’s AI development efforts.

The lawsuit opens another major legal battle between creators and AI companies over the use of copyrighted material for training artificial intelligence systems.

This fight is really about a simple question: can AI companies use people’s work to build billion-dollar technology without asking first?

Allegations of Piracy and Copyright Infringement

According to the lawsuit, Meta stole millions of copyrighted works from various websites, including "notorious pirate sites," and used them to train Llama without permission. The complaint states that the material included textbooks, journal papers, novels, and scientific publications. The works listed include The Fifth Season and The Wild Robot.

The plaintiffs also accuse Meta of deleting copyright management information to conceal the provenance of its AI training materials. The lawsuit argues that Meta's AI system can reproduce versions of original works and, in some cases, generate nearly verbatim passages, as well as imitate the writing styles of specific authors.

Role of Mark Zuckerberg and Meta’s Response

The lawsuit says that Zuckerberg is directly responsible because he personally approved the use of pirated material and did not follow the usual licensing procedures. The plaintiffs argue that his close involvement in Meta’s AI strategy contributed to the company’s rapid growth in the AI sector.

Meta denied wrongdoing and said it plans to fight the lawsuit aggressively. A company spokesperson said AI is driving innovation, productivity, and creativity, and argued that courts have recognized that AI training on copyrighted material can qualify as fair use. The company maintains that such training is legally protected under existing fair-use principles.

Growing Legal Battle Between AI and Creators

This case is part of a larger fight between AI developers and the creative industry. Writers, publishers, news organizations, and artists are increasingly suing companies like Meta, OpenAI, and Anthropic over the use of their work. Anthropic agreed to pay $1.5 billion last year to settle a class-action lawsuit brought by authors, one of the biggest copyright settlements ever related to the development of AI.

Debate Over Fair Use and AI Outputs

Legal experts say one of the key questions in these cases is whether AI systems transform copyrighted material enough to qualify as fair use under U.S. law. Courts have already issued differing rulings on the issue, leaving the legal landscape uncertain.

Professor Michael Goodyear of New York Law School said the strongest copyright cases are usually those where AI-generated outputs closely resemble original works. He noted, however, that this lawsuit is more about the training process than the final outputs.