Adobe is facing a class action lawsuit alleging that it used pirated books to train its artificial intelligence model, SlimLM.
Author and editor Elizabeth Lyon filed the suit, alleging that Adobe incorporated copyrighted works into its AI training data without consent. According to the complaint, Adobe relied on unauthorised material to develop SlimLM, a small language model designed for document assistance on mobile devices.
The complaint states that Adobe trained the model on the SlimPajama dataset, a filtered and deduplicated version of the RedPajama dataset, which researchers have linked to large volumes of copyrighted content.
Who is Elizabeth Lyon?
Elizabeth Lyon is a veteran writing teacher and book editor. She has worked in the publishing industry since 1988. Lyon is best known for her instructional booklets on writing, editing, and marketing fiction and non-fiction.
Lyon claims that works she authored appear in the training data and that Adobe used them without permission or compensation. The lawsuit argues that this practice violates copyright law and harms authors.
Adobe sued for allegedly misusing authors' work in AI training https://t.co/H7UKoGYWPd
— Reuters (@Reuters) December 17, 2025
How the dataset was built
The filing states that SlimPajama originated from the RedPajama dataset. RedPajama includes a collection known as Books3. Books3 reportedly contains around 191,000 books, many of which remain under copyright.
The lawsuit claims Adobe copied and modified this dataset to train SlimLM. Lyon argues that this process amounts to large-scale copyright infringement.
The case against Adobe reflects a growing legal pushback against AI developers, with authors, artists, and publishers accusing major tech firms of using copyrighted material without approval.
Similar lawsuits have targeted Apple and Salesforce over alleged use of the same dataset. In another high-profile case, AI firm Anthropic agreed to pay $1.5 billion to authors who accused it of training its Claude chatbot on pirated works.
These cases highlight unresolved legal questions around AI training data and intellectual property rights. Courts now face pressure to clarify how companies may lawfully source materials for machine learning models. As AI adoption accelerates, scrutiny over data ethics continues to intensify, and the outcome of the Adobe case could shape how AI companies build and train future models.