Wall Street Journal and New York Post Publishers Sue AI Company for Copyright Infringement and Trademark Dilution
The legal battle between original content creators and AI companies is nothing new. Last year, respected novelists and authors filed a lawsuit against OpenAI and Meta for allegedly using their copyrighted works to train AI models like ChatGPT and LLaMA. But of course, LLMs don’t limit their sources to fiction — there’s a lot of journalism out there that feeds chatbots as well.
So, it was only a matter of time before companies that own newspapers took their complaints to court. Another AI company you may not have heard of, Perplexity, is now facing a lawsuit from Dow Jones and NYP Holdings alleging copyright and trademark violations.
Dow and NYP
You've heard of Dow Jones but may not associate it with much outside of Wall Street. The name is most famously attached to the Dow Jones Industrial Average, a stock market index that tracks 30 major publicly owned companies traded on the New York Stock Exchange and NASDAQ.
But you might not know that Dow Jones & Company is a household name in American journalism due to the prominence of its flagship publication, the Wall Street Journal (along with other reputable outlets like Dow Jones Newswires and Barron’s). The WSJ is a trusted business and financial news source, having won 39 Pulitzer Prizes for its investigative journalism and reporting.
NYP Holdings is similarly large in journalism. It publishes the New York Post, America’s oldest continuously published newspaper, established in 1801 by Alexander Hamilton. It ranks as the third-largest newspaper by print circulation in the United States.
Both companies rely on the talent and dedication of their journalists, who often work under challenging conditions to produce high-quality journalism, including risking their lives in war zones and facing arrest or prosecution to keep the public informed. The publications claim that their success is due not only to the stories they uncover but also to how those stories are presented, which requires significant creativity and skill from their journalists and editors.
Perplexity and RAG
Perplexity AI is probably one of the many generative artificial intelligence companies you’ve never used. The company claims to provide users with accurate and up-to-date news and information through an AI platform that allows users to "Skip the Links" to original publishers’ websites.
To do this, Perplexity creates a "retrieval-augmented generation" (RAG) index, which stores the copied content and uses it to produce outputs that substitute for the original news sources. RAG is a technique that helps machines understand and respond to human input in a more informed, human-like way, and it's used in applications like chatbots, language translation, and text summarization.
Imagine you ask a question or give a prompt to a computer, and it needs to respond with a relevant answer or text. A RAG index is like a super-smart librarian that helps the computer find the right information to generate a good response. The RAG index quickly searches through a massive library of texts, articles, and documents to find the most relevant information related to your question or prompt.
The RAG index then uses this retrieved information to "augment" or enhance the computer's understanding of the topic. It's like the librarian highlighting important keywords and concepts in the book. Finally, the computer uses this augmented information to generate a response or text that answers your question or responds to your prompt. It's like the librarian writing a summary of the book based on the highlighted keywords.
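For the technically curious, here is a minimal sketch in Python of what a retrieve-and-augment step can look like in general. Everything in it is a hypothetical stand-in for illustration: the tiny article "library," the word-overlap scoring, and the prompt template are not Perplexity's actual system or index.

```python
import math
from collections import Counter

# A tiny, hypothetical "library" standing in for the indexed articles.
ARTICLES = {
    "wsj-markets": "Stocks rallied as the Dow Jones Industrial Average climbed on strong earnings.",
    "nyp-city": "City officials announced a new transit plan for the five boroughs.",
    "wsj-tech": "An AI startup raised new funding to expand its answer engine.",
}

def bag_of_words(text: str) -> Counter:
    """Turn text into lowercase word counts (a crude stand-in for embeddings)."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Score how similar two bags of words are, from 0 (unrelated) to 1 (identical)."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, top_k: int = 1) -> list[str]:
    """The 'librarian' step: return the top_k most relevant articles for the query."""
    q = bag_of_words(query)
    ranked = sorted(ARTICLES.items(), key=lambda kv: cosine_similarity(q, bag_of_words(kv[1])), reverse=True)
    return [text for _, text in ranked[:top_k]]

def build_prompt(query: str) -> str:
    """The 'augment' step: attach the retrieved text to the user's question before generation."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."

print(build_prompt("What did the Dow Jones index do today?"))
```

In real RAG systems the crude word counting above is replaced by learned embeddings and a vector database, and the assembled prompt is handed to a large language model to write the final answer. The legal fight is over where the indexed text comes from in the first place, and what happens to the publishers who wrote it.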
Perplexing Problems
The problem? Well, for one, Perplexity isn’t getting permission to use this content. According to Dow Jones and NYP, Perplexity relies on copying a large amount of copyrighted content from publishers like them without authorization, using it to generate responses to user queries.
But separately, Perplexity isn't always presenting accurate information. Like many chatbots, it's prone to "hallucinations," which occur when generative models produce false or fabricated information and present it as fact. Here, the plaintiffs allege, that means the AI creating fake sections of news stories and attributing them to real publishers, which can confuse readers and dilute trademarks.
This issue arises because language models predict words that seem correct based on prompts, a process that can lead to inaccuracies. Matthew Sag, a professor of law and artificial intelligence at Emory University, highlights the inherent unpredictability in how these models generate content. "It is absolutely impossible to guarantee that a language model will not hallucinate," he says. "We only call it a hallucination if it doesn't match up with our reality, but the process is exactly the same whether we like the output or not."
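Sag's point is easier to see with a toy example. The sketch below, again in Python and entirely made up for illustration, just chains together words that plausibly follow one another. Nothing in it checks whether the resulting "news" sentence is true, which is exactly why fluent output and factual output are not the same thing.

```python
import random

# A toy "language model": for each word, some words that plausibly follow it.
# This table is invented for illustration; real models learn probabilities from huge datasets.
NEXT_WORDS = {
    "the": ["journal", "post", "index"],
    "journal": ["reported", "said"],
    "post": ["reported", "said"],
    "reported": ["that"],
    "said": ["that"],
    "that": ["stocks", "the"],
    "stocks": ["fell", "rose"],
    "index": ["rose"],
}

def generate(start: str, length: int = 6) -> str:
    """Chain plausible next words. The model has no notion of truth: it produces
    text that *looks* like news whether or not the events ever happened."""
    words = [start]
    for _ in range(length):
        options = NEXT_WORDS.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))  # e.g. "the journal reported that stocks fell" -- fluent, but never fact-checked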
Well, you can probably see how these problems are no good for the news companies.
Publishers Rag on Perplexity
Dow Jones and NYP have filed a complaint in court against Perplexity, arguing that the AI’s practices harm their financial incentives to create original content, ultimately threatening the sustainability of their journalism.
The complaint explains that publications like the WSJ and the Post rely on a revenue model driven by subscriptions, advertising, and content licensing. In the digital age, these publishers generate significant income from selling subscriptions to their digital publications. They also earn revenue from online advertising, which is displayed when consumers visit their websites directly or reach them through search engines and other links. A third crucial revenue stream for the plaintiffs is licensing their content to third parties, including AI companies that lawfully pay to use their high-quality journalism in various applications.
The news companies allege that Perplexity undermines these revenue streams by copying and reproducing their copyrighted content without authorization. By doing so, Perplexity diverts customers and crucial revenues away from the original publishers, as users are encouraged to "Skip the Links" to the original websites. This not only affects the direct traffic to the plaintiffs' sites, reducing their advertising revenue, but also impacts their ability to sell subscriptions and secure licensing deals. Perplexity has allegedly raised significant capital to develop its "answer engine," positioning itself as a competitor to traditional news outlets by offering a product that bypasses the need to visit original content creators' websites.
Copyright and Trademark Claims
The plaintiffs are suing for two counts of copyright infringement, one for inputs and one for outputs. Perplexity is first accused of copying the copyrighted works without authorization to create the RAG index, which the plaintiffs claim violates the Copyright Act because it involves unauthorized reproduction of their content.
Separately, the outputs generated by Perplexity's AI (which include verbatim reproductions, summaries, or paraphrases of the plaintiffs' copyrighted content) are alleged to infringe the plaintiffs' copyrights. These outputs act as substitutes for accessing the original content, diverting revenue from the plaintiffs, and thus, they argue, constitute a separate violation of the Copyright Act.
Finally, the plaintiffs are also bringing claims of "false designation of origin" and trademark dilution. They claim that Perplexity falsely attributed its fabricated content to Dow Jones and NYP's trademarked publications. This misleads users into believing the content is authentic, causes confusion, and dilutes the trademarks, thereby harming the publications' brand reputations.
What to Expect
This lawsuit is only the latest in a long line of legal challenges between creators and AI companies, highlighting a growing tension over intellectual property rights. Some legal experts express doubts about the trademark claims, and the case is by no means a slam-dunk. But as Robert Thomson of News Corp emphasizes, there is a need to "challenge the content kleptocracy" to safeguard journalistic integrity. This lawsuit, even if unsuccessful, won't be the last.
Related Resources:
- What Is the AI-focused COPIED Act? (FindLaw's Law and Daily Life)
- Can I Sue an Artificial Intelligence Company for AI Copyright Violations? (FindLaw's Learn About the Law)
- Music Labels Join Lawsuit Frenzy Against Artificial Intelligence (FindLaw's Legally Weird)