Judge Trims Copyright Lawsuit Against AI Model Stable Diffusion
A federal judge in California has streamlined a class action brought by three artists who accuse Stability AI, DeviantArt, and Midjourney of unlawfully using their copyrighted works to train the image-generating AI model Stable Diffusion. While dismissing most claims without prejudice (meaning the plaintiffs can replead them), U.S. District Judge William Orrick let the core claim of copyright infringement against Stability AI stand for now.
What does this mean? Why does this matter?
We wrote back in January 2023 to expect a slew of lawsuits by those who create content (artists, authors, actors, playwrights, musicians, and so on) against companies that may have used their copyrighted works to train generative AI models. Looks like we were right. Content creators ranging from author George R.R. Martin of Game of Thrones fame to actor and comedian Sarah Silverman have brought copyright claims against major AI players such as Stability AI, OpenAI (maker of ChatGPT), and Meta (maker of LLaMA). Judge Orrick's ruling may give them reason to be optimistic about their chances of success.
Let's break down what the companies are alleged to have done, the court's decision on the defendants' motion to dismiss, and the decision's potential significance.
Andersen v. Stability AI, Ltd.
The plaintiffs' core allegations are that in August 2022, Stability created and released Stable Diffusion under an open-source license, which DeviantArt and Midjourney then used to some extent in their own products. To train Stable Diffusion, Stability obtained more than five billion images that the Large-Scale Artificial Intelligence Open Network (LAION) had scraped from the internet. While future versions of Stable Diffusion are supposed to be trained on fully licensed images, the current version is not.
The plaintiffs concede that specific images generated by Stable Diffusion are unlikely to closely match their original content. Instead, they claim that because the AI model was trained, in part, on their original content, any images it generates are "derivative works" that infringe their copyrights.
AI Training Infringement Claims Survive
Judge Orrick's decision begins with some housekeeping. He dismissed the copyright infringement claims brought by the two artists who had failed to register their copyrights with the Copyright Office. He then limited the claims brought by the third artist to only those images she had in fact registered. A gentle reminder to readers: if you want to sue for infringement, you need to register your copyright.
As to the AI training claim, Judge Orrick ruled that the artist had plausibly alleged that her entire collection had been scraped from the internet and included in the LAION dataset.
Recall that on a motion to dismiss, the court must draw reasonable inferences in favor of the plaintiff and determine whether she has alleged "plausible" claims under the law. Although the artist did not identify which specific images were included in the dataset, she relied on the output of a search of the website "haveibeentrained.com," which suggested that at least some of her content had been used. Coupled with the allegation that the five billion training images had been scraped from the internet, the court found reasonable the inference that all of her registered works were swept into the AI training datasets.
The defendants had challenged the plaintiff's reliance on searches of "haveibeentrained.com" because many of the images returned by the search were not associated with specific artists. The court rejected this argument, finding that the parties could sort out precisely what may have been used during discovery.
Judge Orrick did dismiss all of the plaintiffs' other claims, including vicarious infringement and unfair competition. But he gave the plaintiffs leave to replead them in an amended complaint and essentially provided an outline of how they could avoid dismissal in the future.
The First Decision of Many?
Why is this case news? It appears to be the first in the country in which a federal judge has suggested that you can't train AI on someone's copyrighted works without their permission. In response to a request for comments by the U.S. Patent and Trademark Office, OpenAI, the creator of ChatGPT, argued that copyright law's fair use doctrine permitted training on copyrighted works. As we wrote in January, we disagreed with that argument, so score one for us, and for content creators, for now.
This decision is huge and should provide an incentive for all content creators to register their copyrights. It's not hard — you can do it online. This is particularly important if you want to sue for training infringement.
In the meantime, there are several other similar copyright infringement cases, including a couple of class actions, pending against other providers of AI. In the one Sarah Silverman and others brought against Meta, Meta did not move to dismiss the central allegation that training on content without the creator's permission can constitute copyright infringement. But that doesn't mean they won't challenge it later in the case. Assuming Congress doesn't intervene, we can expect to see this AI training-infringement issue percolate in the lower courts until it becomes necessary for the Supreme Court to address it.
The case is Andersen v. Stability AI, Ltd.