The question of copyright is a tangled mess of laws and ethics. That much has been known ever since the concept of “intellectual property” was first introduced. But what happens when the entity that is plagiarizing isn’t even human? The courts will soon have to grapple with this issue as The New York Times joins an onslaught of creators suing OpenAI and Microsoft for copyright infringement.
To train its AI, OpenAI sources content from all over the Internet. The company feeds that content to its large language models (LLMs for short) to teach them how to predict the next word in a sentence and formulate the human-like responses we are accustomed to seeing from ChatGPT. Microsoft has a partnership with OpenAI that allows it to use GPT software in its own AI bot, Bing Copilot. Both companies are tight-lipped about how they use copyrighted content in their AI, other than citing “fair use” laws. In this context, they say they can use any copyrighted material as long as they have paraphrased or edited it so that it differs significantly from the original work.
In its lawsuit, The New York Times accuses OpenAI and Microsoft of using its copyrighted content to train their LLMs, even though the Times never gave them permission to use that content in any way. The paper also argues that using its content allows ChatGPT and Bing Copilot to compete directly with the Times’s own services. To support these claims, the Times was able to surface dozens of examples of ChatGPT and Bing Copilot spitting out its articles word-for-word without any form of attribution.
OpenAI countered this accusation by restating the fair use laws that AI companies tend to rely on. Moreover, OpenAI pointed out that The New York Times’s examples were generated when users asked Bing Copilot to show the article itself, leaving the AI with no choice but to pull up the article directly from the web. Meanwhile, Microsoft maintained that it used OpenAI’s technology, not its own, to create Bing Copilot, and was therefore not liable for any copyright infringement in that regard.
After reading the previous paragraph, you might ask: why is The New York Times even bothering to sue? After all, it seems that OpenAI and Microsoft were able to shred the newspaper’s most compelling arguments, so any legal action would likely be futile, right? Well, it is believed that this lawsuit is intended to be more of a power play by the Times than an actual case of copyright infringement. The Times is probably hoping to reach some sort of monetary settlement. There has also been some speculation that the paper is out to attack Google’s rivals after Google paid the Times millions of dollars to use its content. But mostly, the Times is looking to garner public attention and buy some time as AI threatens the digital news industry.
Overall, this lawsuit is yet another example of how the “old guard” of renowned news establishments and the “new guard” of generative AI are clashing over the murky topic that is copyright infringement.