The New York Times Wants ChatGPT Gone. Nice Try

The battle between copyright holders and generative AI companies is heating up, and The New York Times is leading the charge.

The publication recently filed a lawsuit against OpenAI and Microsoft that claims copyright infringement, trademark dilution, and unfair competition. And it’s not pulling its punches. The suit seeks not just monetary compensation but also the destruction of all of the defendants’ large language models (LLMs) and training data, as well as a halt to unlicensed training on the publication’s articles.

“When you have these big technology shifts, the law has to adjust,” says Cecilia Ziniti, a Silicon Valley attorney. “This case is so historic because The New York Times has millions and millions of words that are used for training. So, extrapolating out, now is the time this is going to be regulated and get looked at.”

The showdown between copyright holders and AI companies intensifies

The New York Times’s lawsuit against OpenAI and Microsoft is the latest in a string of complaints against generative AI companies. Getty Images filed suit against Stability AI, creator of the image generation tool Stable Diffusion, in early 2023, and several music publishers filed suit against Anthropic, creator of Claude.ai, in October.

But The New York Times’s suit is notable for its scope. It accuses the defendants of “copying and using millions of The Times’s copyrighted [articles].” The claim is supported by 100 examples of ChatGPT reproducing near-exact copy from New York Times articles.

“Whenever you have a verbatim copy, that’s a replacement, and that’s going to be pretty colorable [plausible to the court],” says Ziniti. “The New York Times also has enough of a library, going back to 1851, that they can actually say some percentage of the training data was New York Times.”

The New York Times’s lawsuit against OpenAI and Microsoft provides examples of ChatGPT producing text similar to the publication’s articles. (Image credit: The New York Times)

Even so, the suit’s success isn’t certain. Mike Masnick, founder and editor of the technology policy publication Techdirt, points out that prior cases, such as the Authors Guild lawsuit against Google Books, set a precedent that may protect the use of copyrighted data to train AI.

“I go back to the most important similar case, which is the Google Books case,” says Masnick. “[Google] scanned books in order to create a giant search engine of books. That was very much a commercial entity, for a commercial purpose…that involved scanning entire copyrighted books and building a massive index of all those works.” Google argued that Google Books was transformative fair use and prevailed.

And there’s yet another complication: recent agreements between OpenAI and other publishers, such as Axel Springer and the Associated Press. The exact terms of the deals are unknown, but a press release from OpenAI states its deal with Axel Springer will help the publisher summarize “selected global…

The post “The New York Times Wants ChatGPT Gone. Nice Try” by Matthew S. Smith was published on 01/06/2024 by spectrum.ieee.org