Suing the Machine: The Clash Between Copyright Law and AI Innovation

Kenneth DiFilippo

Staff Editor, Delaware Journal of Corporate Law, Volume 50

Introduction

    Innovation is everything in the technology sector. Artificial Intelligence (“AI”) technologies have exploded in recent years, drawing attention from news outlets and ultrawealthy investors seeking to understand, and profit from, what some think may be a new age of technology.[1] However, training AI models is a costly, complex process fueled by mountains of data.[2] This blog will briefly discuss how AI models are trained and analyze how a recent District of Delaware decision could pave the way for many successful legal challenges to machine learning projects.

    Ghost in the Machine

      While the average layperson may immediately conjure up visions of metal men and killer computer code upon hearing the term AI, these computer programs are far more limited than those seen in fiction.[3] Many of the most widely known AI applications today (ChatGPT, Alexa, Google Assistant, and self-driving car systems) utilize previously analyzed data to “react” to a given stimulus and choose the course of action most conducive to achieving their tasks.[4] However, to be able to react in this way, AI programs must be “trained” through a process called machine learning.

      Machine learning is the process of computers “detect[ing] patterns in massive datasets” and “mak[ing] predictions based on what the computer learns from those patterns,” mimicking the human ability to “perceive, learn, and problem solve.”[5] To create these massive datasets, AI companies collect as much text, photography, and numerical information as possible—from bank statements to photos of fauna.[6] This data is often collected by “web crawlers” and “web scrapers” (collectively, “data collectors”), automated computer programs which catalog and extract information from the internet—website by website, link by link.[7] “The more data [collected], the better the program.”[8]
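      For readers curious about what “website by website, link by link” means in practice, the following is a minimal sketch of a data collector, not a description of any particular company's tooling. It assumes a tiny, hypothetical in-memory "web" (the `PAGES` dictionary and the `example.test` addresses are invented for illustration) so that it runs without touching the internet. The names `LinkAndTextParser` and `crawl` are likewise illustrative.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for real websites: URL -> page HTML (no network access).
PAGES = {
    "https://example.test/": '<p>Home</p><a href="/a">A</a><a href="/b">B</a>',
    "https://example.test/a": '<p>Photos of fauna</p><a href="/b">B</a>',
    "https://example.test/b": '<p>Blog post</p><a href="/">Home</a>',
}


class LinkAndTextParser(HTMLParser):
    """Collects link targets (to crawl next) and visible text (to keep)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # "Crawling": remember every hyperlink found on the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # "Scraping": keep any non-empty text found on the page.
        if data.strip():
            self.text.append(data.strip())


def crawl(start_url, fetch=PAGES.get):
    """Visit pages breadth-first, storing each page's text and queuing its links."""
    seen = {start_url}
    queue = deque([start_url])
    collected = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page unavailable in this toy "web"
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        collected[url] = " ".join(parser.text)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


dataset = crawl("https://example.test/")
```

      Starting from a single address, the sketch ends up holding the text of every page it could reach by following links. Nothing in the loop asks whether that text is copyrighted.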

      The need for massive amounts of training data in a competitive, fast-moving environment has raised serious data protection concerns. Data collectors can reach almost any website that does not require a login.[9] This leaves vast amounts of data, copyrighted and not, available for automated collection, including, for example, this very blog.[10] Naturally, this largely unregulated practice of automatically collecting and using online data to train for-profit AI programs has led to a growing wave of intellectual property (“IP”) suits.[11]

      The First Domino

        The United States District Court for the District of Delaware delivered a major victory for IP holders, becoming the first court to decide, among other issues, that the use of copyrighted materials to train an AI model did not fall under the defense of fair use.[12] In Thomson Reuters Enterprise Centre GMBH v. Ross Intelligence, Inc. (“Thomson Reuters”), Thomson Reuters, owner of the Westlaw legal research platform, brought suit against Ross Intelligence, Inc. (“Ross”) for copyright infringement.[13] Both parties filed motions for summary judgment.[14] While Judge Bibas initially found disputes of material fact concerning Ross’s fair use defense,[15] the Court later reversed itself, holding that Thomson Reuters was entitled to summary judgment on the issue of fair use.[16]

        In its analysis of fair use, the Court concluded that the character of Ross’s data usage was distinctly “commercial” rather than “transformative.”[17] In other words, Ross sought merely to profit from Thomson Reuters’s protected material, rather than employ it for its own distinct purpose.[18] Judge Bibas also determined that Ross’s use of Westlaw’s copyrighted materials was directly adverse to Thomson Reuters, furthering the development of a rival legal research platform and interfering with Thomson Reuters’s right to license its data for profit.[19] Together, these factors led the Court to reject Ross’s affirmative defense.[20]

        An Uncertain Precedent

          In the wake of this novel ruling, there is much uncertainty as to how other courts will apply the Court’s analysis, especially in relation to generative AI platforms.[21] While Judge Bibas specifically noted that generative AI was not at issue in the case,[22] courts considering similar copyright infringement litigation involving generative AI will have to decide whether to adopt or reject this analysis. Furthermore, even where non-generative AI is at issue, the fair use analysis is exceedingly fact-specific.[23]

          A case similar to Thomson Reuters, but involving generative AI, would likely hinge on the transformative nature of the AI model’s output.[24] Of the two fair use factors discussed in Thomson Reuters, (1) whether the defendant’s use is more transformative than commercial and (2) how the new use affects the original work’s market value,[25] the market-value factor is unlikely to be disputed. It is hard to envision a scenario where a screenwriter is not financially exploited by AI-generated scripts based on their work, a struggling artist does not lose clients to AI creating new pieces in their style, or a digital design company is not harmed by AI models creating unique logos based on its previous designs. Furthermore, while many AI companies offer free trials and free versions of their machine learning products, the costly nature of developing these technologies is leading to increasingly monetized forms of AI.[26] Therefore, courts wrestling with whether fair use is an effective defense for generative AI defendants will likely focus on the transformative nature of such a program, a question left completely unanswered by Judge Bibas.

          Beyond the legal application of Thomson Reuters, the case leaves the Third Circuit, in particular the District of Delaware, in a state of limbo. The floodgates are open and AI litigation is here to stay. Courts across the country are faced with all manner of AI-related litigation, all of which offer opportunities for jurisdictions to become more, or less, accommodating to AI development than Judge Bibas was. As more and more jurisdictions wrestle with the conflict between machine learning and IP infringement, the question remains: has the District of Delaware become the leading voice in a rapidly growing field of legal thought, or will Thomson Reuters lead to broader legal disputes between circuits on the nature of AI litigation?

          About the Author

          Kenneth is a second-year law student at Widener University Delaware Law School and is a Staff Editor for Volume 50 of the Delaware Journal of Corporate Law. He is the incoming Editor-in-Chief for Volume 51 of the Delaware Journal of Corporate Law. Kenneth graduated summa cum laude from Immaculata University in 2024, earning his bachelor’s degree with a major in Criminology. While in law school, he has worked as a legal intern at Lenox Law Firm in Lawrenceville, New Jersey and Edelstein, Martin, & Nelson in Philadelphia, Pennsylvania. In the summer of 2025, Kenneth will work as a legal intern at Begley, Carlin & Mandio in Langhorne, Pennsylvania. After graduating from law school, Kenneth plans to use his skill in legal writing, oral advocacy, and negotiation to work in civil law in either Pennsylvania, New Jersey, or Delaware.


          [1] Hersh Shefrin, Big Bets on AI in an Overvalued Environment, Forbes (Feb. 28, 2025), https://www.forbes.com/sites/hershshefrin/2025/02/28/investors-are-making-big-ai-based-sentiment-bets; David Ly, What’s Next for AI: The Next Wave of Use Cases in 2024 and Beyond, Forbes (Feb. 2, 2024), https://www.forbes.com/councils/forbestechcouncil/2024/02/02/whats-next-for-ai-the-next-wave-of-use-cases-in-2024-and-beyond; Evan Reas, The Next Big Wave: AI is the Next Big Technological Wave, Ohio Wesleyan Univ., https://www.owu.edu/alumni-family-friends/owu-magazine/spring-2024/artificial-intelligence-in-a-real-world/the-next-big-wave-ai-is-the-next-big-technological-wave (last visited Mar. 9, 2025).

          [2] Sara Brown, Machine Learning, Explained, MIT Mgmt. Sloan Sch. (Apr. 21, 2021), https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained; Michael Chen, What is AI Model Training & Why is it Important?, Oracle (Dec. 6, 2023), https://www.oracle.com/artificial-intelligence/ai-model-training.

          [3] Understanding the Different Types of Artificial Intelligence, IBM (Oct. 12, 2023), https://www.ibm.com/think/topics/artificial-intelligence-types (explaining that “the only type of AI that exists today” is “Weak AI,” only capable of “perform[ing] a single or narrow task”).

          [4] Id.

          [5] DOE Explains . . . Machine Learning, U.S. Dep’t of Energy, https://www.energy.gov/science/doe-explainsmachine-learning (last visited Mar. 9, 2025); see Brown, supra note 2.

          [6] Brown, supra note 2.

          [7] Lauren Leffer, Your Personal Information is Probably Being Used to Train Generative AI Models, Sci. Am. (Oct. 19, 2023), https://www.scientificamerican.com/article/your-personal-information-is-probably-being-used-to-train-generative-ai-models.

          [8] Brown, supra note 2.

          [9] Leffer, supra note 7.

          [10] Id.

          [11] Thomson Reuters Enter. Ctr. GMBH v. Ross Intel., Inc. (Thomson Reuters I), 694 F. Supp. 3d 467, 475 (D. Del. 2023); Complaint at 2, Authors Guild v. OpenAI Inc., No. 23-cv-8292 (S.D.N.Y. filed Sept. 19, 2023); Complaint at 1, Getty Images (US), Inc. v. Stability AI, Inc., No. 23-135 (D. Del. filed Feb. 3, 2023); Complaint at 1, Nazemian v. NVIDIA Corp., No. 24-cv-2655 (N.D. Cal. filed Mar. 8, 2024).

          [12] Thomson Reuters Enter. Ctr. GMBH v. Ross Intel., Inc. (Thomson Reuters II), No. 20-cv-613, 2025 WL 458520, at *10 (D. Del. Feb. 11, 2025). Fair use is an affirmative defense to a claim of copyright infringement. In determining whether the defendant has successfully claimed fair use, a court must balance four factors: “(1) the use’s purpose and character, including whether it is commercial or nonprofit; (2) the copyrighted work’s nature; (3) how much of the work was used and how substantial a part it was relative to the copyrighted work’s whole; and (4) how [defendant’s] use affected the copyrighted work’s value or potential market.” Id. at *6–7.

          [13] Thomson Reuters I, 694 F. Supp. 3d at 475–76.

          [14] Id. at 475.

          [15] Id. at 481–87.

          [16] Thomson Reuters II, 2025 WL 458520, at *1.

          [17] Id. at *7–8.

          [18] Id. at *7.

          [19] Id. at *9–10.

          [20] Thomson Reuters II, 2025 WL 458520, at *10.

          [21] Dustin Taylor, Delaware Court Grants Summary Judgment to Plaintiff in Machine Learning/AI Copyright Case, Husch Blackwell (Feb. 12, 2025), https://www.huschblackwell.com/newsandinsights/delaware-court-grants-summary-judgment-to-plaintiff-in-machine-learning-/-ai-copyright-case; Jim Vana, Delaware Court Delivers First Copyright Verdict on AI Training, Schwabe, Williamson & Wyatt (Feb. 19, 2025), https://www.schwabe.com/publication/delaware-court-delivers-first-copyright-verdict-on-ai-training. The term “Generative AI” refers to AI programs that create new data similar to or stylistically consistent with the datasets they were trained on. Adam Zewe, Explained: Generative AI, MIT News (Nov. 9, 2023), https://news.mit.edu/2023/explained-generative-ai-1109.

          [22] Thomson Reuters II, 2025 WL 458520, at *7–8.

          [23] See id. at *6–10.

          [24] Vana, supra note 21.

          [25] See Thomson Reuters II, 2025 WL 458520, at *6–10.

          [26] See Aditya Soni et al., OpenAI Outlines New For-Profit Structure in Bid to Stay Ahead in Costly AI Race, Reuters (Jan. 2, 2025), https://www.reuters.com/technology/artificial-intelligence/openai-lays-out-plan-shift-new-for-profit-structure-2024-12-27.

