AI, Copyright, and the Future of Creativity

Debajyoti Chakravarty

The rapid growth of generative AI has sparked global debate on whether using copyrighted works to train large language models (LLMs) constitutes permissible use or infringement. This paper undertakes a comparative analysis of copyright laws, policy frameworks, and judicial approaches across the United States, European Union, United Kingdom, China, Japan, Singapore, and India. It finds an emerging convergence around lawful access, transparency, technical safeguards, and licensing solutions, albeit through different legal approaches. Examining key cases, it finds that courts are increasingly emphasising market harm, lawful data sourcing, and evidentiary standards over categorical bans on AI model training. Situating India within this global landscape, this paper argues that India’s reliance on fair dealing provisions in this context creates legal uncertainty for both creators and innovators. It proposes a balanced framework that combines statutory clarity, licensing mechanisms, and adaptive regulatory oversight to boost innovation while protecting creators.

Attribution:

Debajyoti Chakravarty, “AI, Copyright, and the Future of Creativity,” ORF Occasional Paper No. 547, Observer Research Foundation, May 2026.

Introduction

The rise of Artificial Intelligence (AI) has transformed the global landscape of creativity, innovation, and regulation. From generative models that compose music and literature to systems capable of producing complex analytical or visual works, AI now stands at the intersection of human ingenuity and machine learning. This transformative potential raises a central question of the digital era: how should copyright law respond to the use of protected works in training AI models? As algorithms learn from vast datasets of books, images, and other creative outputs, traditional notions of authorship, originality, and ownership are being challenged in unprecedented ways.

Across jurisdictions, governments, policymakers, and courts are grappling with this dilemma. Recent developments in the United States (US), Europe, China, and Singapore chart divergent yet intersecting paths in regulating the use of copyrighted material in AI training. Landmark judgements in the US have sought to reconcile innovation with protection by interpreting doctrines such as “fair use” in the context of AI. The European Union (EU), for its part, has adopted a more precautionary approach through its AI Act and text and data mining exceptions, while jurisdictions such as China and Singapore are developing governance models that combine regulation, transparency, and accountability.

India stands at a critical juncture in this debate. As one of the fastest-growing AI markets, with large creative industries and an evolving digital ecosystem, its response to the copyright implications of AI training will have far-reaching consequences. The existing legal framework under the Copyright Act, 1957 provides certain protections but offers limited clarity on the permissibility of using copyrighted works for training AI models.

This paper explores how global statutes, policy frameworks, and case law are shaping the AI-copyright debate and outlines the lessons for India. It analyses patterns of convergence and divergence across leading jurisdictions and reflects on pathways to balance innovation, creativity, and protection in India’s AI ecosystem. Through this comparative lens, it outlines the contours of a future-ready copyright framework that fosters technological advancement while safeguarding creative integrity.

Copyright Law in AI Training: A Comparison of Seven Jurisdictions

The rapid expansion of generative AI has triggered legal and policy debates over whether training AI models on copyrighted materials constitutes permissible use or copyright infringement. The seven jurisdictions analysed in this paper—the US, the United Kingdom (UK), the EU, China, Japan, Singapore, and India—have adopted distinct legal doctrines, statutory frameworks, and policy approaches to reconcile the tension between technological innovation and the protection of creative works. Despite varying legal traditions, several converging themes emerge: lawful access to data, the need for transparency, the role of licensing markets, and the balancing of innovation with creators’ rights.

United States: The ‘Fair Use’ Doctrine and Doctrinal Flexibility

The US remains anchored in the flexible “fair use” doctrine codified in 17 U.S. Code, Section 107,^[1] which permits certain uses of copyrighted works for purposes such as criticism, research, and scholarship.^[2] Courts assess fair use on a case-by-case basis using four variables:^[3] the purpose and character of the use (including its commercial nature); the nature of the copyrighted work; the amount and substantiality of the portion used; and the effect of the use on the work’s market value.^[4] This open-ended test allows for contextual judicial interpretation, giving courts latitude to assess whether AI training is “transformative”—that is, whether it produces new insights or functionalities rather than substituting for the original work.

Recent policy discussions by the US Copyright Office suggest that using copyrighted material through lawful access may fall within “fair use”, particularly for research or innovation.^[5] However, there is no explicit text and data mining exception addressing AI training. Courts across the US have therefore interpreted the fair use doctrine differently in such cases.

The commercial nature of a use does not automatically preclude fair use;^[6] a transformative use,^[7] even if commercial, may still qualify.^[8] This flexibility has encouraged AI research and model training. However, in the absence of an explicit legislative exception of the kind found in the EU, and with judicial interpretations varying across courts, it has also created uncertainty in application.

United Kingdom: Lawful Access and Emerging Opt-Out Framework

The UK’s approach, though still evolving, is more prescriptive by nature. Section 29A of the Copyright, Designs and Patents Act 1988 provides an exception for “copies for text and data analysis for non-commercial purpose,”^[9] provided the researcher has lawful access to the work. However, this exception may not extend to commercial AI training, limiting its applicability to academic contexts.

Recognising the growing relevance of AI, the UK government’s 2024 Consultation on Copyright and Artificial Intelligence^[10] proposes a balanced framework allowing AI developers to train on lawfully accessed materials unless rights holders expressly reserve their rights. The consultation envisages an “opt-out” system with increased transparency, requiring developers to disclose training datasets alongside licensing mechanisms and collective rights management. This marks a shift toward an EU-style opt-out regime that prioritises both innovation and control, though its effectiveness would depend on implementing standardised rights-reservation and transparency mechanisms.

European Union: Structured Exceptions and Obligations Consolidated Under the AI Act

The EU provides the most comprehensive and the earliest statutory architecture through the Directive (EU) 2019/790 on Copyright in the Digital Single Market (DSM Directive)^[11] and the 2024 Artificial Intelligence Act (AI Act).^[12] The DSM Directive establishes two key text and data mining (TDM) exceptions. Article 3 permits reproductions and TDM by research organisations and cultural heritage institutions for scientific research, while Article 4 extends TDM to any actor for any purpose, provided the content is lawfully accessed and rights holders have not opted out through machine-readable means. This dual-tiered system creates both a protected research exception and a conditional commercial exception, distinguishing the EU from both the permissive US fair use model and the more limited UK regime.

The EU AI Act (2024) further integrates copyright compliance into AI governance. Its recital 105 explicitly acknowledges that AI model training may involve copyrighted works and requires authorisation unless an exception applies. Article 53 further obliges general-purpose AI model providers to adopt copyright compliance policies, respect opt-outs under the DSM Directive, and publicly disclose summaries of training datasets. This introduces transparency and accountability directly into AI regulation—a marked evolution beyond traditional copyright frameworks. The EU thus couples copyright protection with proactive governance, ensuring that large-scale AI systems adhere to lawful sourcing and disclosure standards.

China: Legal Source Requirement and State-Led Governance

China’s regulatory approach emphasises legality, data integrity, and state oversight rather than nuanced exceptions. The Interim Measures for the Management of Generative Artificial Intelligence Services (2023)^[13] require “Generative artificial intelligence service providers” to “use data and basic models from legal sources” and prohibit infringement of intellectual property. The Chinese Copyright Law (Article 24^[14]) lists 13 exhaustive cases of permitted use “without the author's permission or payment of remuneration,” including research and education, but contains no explicit exception for text and data mining.

However, the Supreme People’s Court’s Opinions (Article 8^[15]) suggest a functional approach: “uses” may be deemed “reasonable” if they promote “technological innovation and business development” without harming authors’ lawful interests or the original use of the work. Such determinations require consideration of the “nature and purposes of use, the nature of works used, the quantity and quality of the part of works used, impacts of use on potential markets or values, and other factors,” introducing a quasi-fair-use test. Thus, while China lacks a statutory AI-specific exception, this flexible interpretation of “reasonable use,” coupled with a strong legality requirement, creates a controlled but adaptive framework that balances copyright holder rights with innovation.

Japan: Non-Enjoyment Test and Context-Based Limitations

Japan’s Copyright Law (Article 30-4^[16]) is perhaps one of the most AI-forward regimes. It permits the exploitation of copyrighted works for “data analysis”, defined as extracting, classifying, or statistically processing elements of works, provided the purpose is not to “personally enjoy the thoughts or sentiments expressed.” This “non-enjoyment” test^[17] allows use for technological development and AI training, so long as it does not unreasonably prejudice the copyright owner’s interests.

In its 2024 “General Understanding on AI and Copyright in Japan Overview,” Japan’s Copyright Office defined this “enjoyment”^[18] as “the act of obtaining the benefit of having the viewer’s intellectual and emotional needs satisfied through using the copyrighted work,” citing examples such as reading literary works, appreciating music, and executing computer programmes. Thus, generating material similar to original works may qualify as “enjoyment,” and if the user’s purpose is even partly for enjoyment, the exception may not apply.

However, the Act does not explicitly address cases where copyrighted database works are reproduced for data analysis, such as AI training, despite licences for such uses being available in the market.

Singapore: Permissive Computational Data Analysis Exception

Singapore’s Copyright Act 2021 establishes a clear, technologically attuned exception for “computational data analysis” (CDA). Under Sections 243 and 244,^[19] copying is permitted for CDA purposes, including improving a programme’s functioning—potentially encompassing AI training. The only conditions are lawful access and non-circumvention of paywalls or licence restrictions. Accordingly, AI developers may use copyrighted data for training, provided access is lawful. Singapore thus adopts one of the most permissive frameworks globally, combining statutory certainty with defined scope.

Singapore’s Model AI Governance Framework^[20] (2024) further reinforces a policy preference for dialogue, transparency, and licensing, urging a balance between creators’ rights and innovation while avoiding rigid restrictions. The overall orientation is pro-innovation, emphasising ethical compliance and stakeholder engagement rather than legal prohibitions.

India: Application of Existing Copyright Law

India’s Copyright Act, 1957 (as amended) does not include an AI-specific or TDM exception. Section 52^[21] provides fair dealing exceptions for research, criticism, reporting, and review, requiring that ‘use’ be “purpose-specific”^[22] in nature and align with one or more of Section 52’s statutory exceptions for the fair dealing defence to apply.

Comparative Trends, Convergences, and Divergences

Across the seven jurisdictions discussed in this paper—the US, India, China, the UK, the EU, Japan, and Singapore—regulation of AI training on copyrighted material shows convergence in policy concerns but divergence in legal mechanisms. Two broad approaches emerge: one relies on flexible judicial doctrines such as fair use or fair dealing; the other on explicit statutory TDM or computational data analysis exceptions.

Countries such as the US, India, and China largely follow the first model. In the US, the precedent-driven “fair use” doctrine allows courts to assess AI training through factors such as “transformative use” and market impact, enabling contextual flexibility but creating uncertainty due to varying judicial interpretations. India’s framework under the Copyright Act, 1957 similarly relies on “fair dealing” provisions, though these are narrower and purpose-specific; its four exceptions do not explicitly permit the use of copyrighted material for AI training. With cases pending before the Supreme Court of India, the possibility remains that courts may interpret these exceptions to cover such uses. China combines a strict legality requirement with regulatory oversight under the Interim Measures for the Management of Generative Artificial Intelligence Services, which mandate the use of data and basic models from “legal sources” and respect for intellectual property. However, the Supreme People’s Court’s evolving Opinions on the “reasonable use” of works to promote “technological innovation and business development” potentially resemble a quasi-fair-use mechanism, one that seeks to balance copyright holders’ rights with the innovative use of their works.

In contrast, the EU, Japan, the UK, and Singapore rely more on statutory TDM-style exceptions that directly address computational analysis of copyrighted works. The EU provides a dual-tier framework under the Directive (EU) 2019/790 on Copyright in the Digital Single Market, permitting research-oriented TDM and broader commercial TDM subject to rights-holder opt-outs. This is reinforced by the Artificial Intelligence Act, which introduces transparency and dataset disclosure obligations for general-purpose AI models, integrating copyright governance with AI regulation. The UK provides a narrower TDM exception limited to non-commercial research but is considering reforms that may introduce opt-out and transparency mechanisms similar to the EU.

Japan adopts an innovation-oriented regime that permits data analysis provided the use is not for “enjoyment” of the work; the regime explicitly enables AI training and reflects deliberate statutory reform to facilitate technological development. Singapore’s Copyright Act 2021 establishes a computational data analysis exception allowing use of copyrighted data for data analysis where access is lawful, while maintaining safeguards against circumvention of licence restrictions.

Despite these divergent legal pathways, several converging trends are evident. Most jurisdictions increasingly emphasise lawful access to data, transparency regarding training datasets, and the growing importance of licensing mechanisms. At the same time, the landscape shows clear divergence in how innovation and creators’ rights are balanced: flexible judicial interpretation-led doctrines produce adaptability but uncertainty, whereas statutory TDM regimes offer clarity but could require continuous legislative updating.

A broader shift is underway in copyright governance, particularly in jurisdictions such as Singapore, the EU, and Japan, where AI-specific statutory reforms, transparency obligations, and opt-out mechanisms are reshaping the use of copyrighted works in technological development. In parallel, licensing arrangements—both bilateral agreements and emerging debates on collective licensing—are becoming an important complement to formal rules. Together, these developments suggest a global shift toward hybrid governance structures that combine copyright law, AI regulation, and market-based licensing to manage the use of copyrighted material in training large AI models.

Global Copyright Policies on AI Training

The policy landscape governing the use of copyrighted materials for training large language models (LLMs) across the US, the EU, the UK, China, Japan, Singapore, and India reveals convergence in policy objectives but divergence in regulatory philosophy. A consistent principle across jurisdictions is the requirement of lawful access to training data. Governments and regulators emphasise that AI developers must obtain copyrighted materials through legitimate means and must not circumvent technological protection measures or paywalls. This is reflected in the EU’s General-Purpose AI Code of Practice^[23] and the UK’s Consultation on Copyright and Artificial Intelligence,^[24] both of which emphasise that text and data mining (TDM) activities should respect rights-holder reservations expressed through machine-readable opt-outs such as robots.txt. Similarly, the US policy in Copyright and Artificial Intelligence Part 3: Generative AI Training^[25] underscores that using copyrighted works obtained through illegal access would fall outside the boundaries of ‘fair use.’^[26]
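To make the idea of a machine-readable reservation concrete, the following is a minimal sketch of how a crawler might consult a robots.txt file before collecting a page. It uses Python's standard `urllib.robotparser` module. The crawler name `ExampleAIBot`, the domain `publisher.example`, and the helper `may_crawl` are hypothetical and used purely for illustration. The sketch shows only the technical check; whether a given robots.txt entry amounts to a legally effective rights reservation in any jurisdiction is a separate question that the code does not answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a rights holder. "ExampleAIBot" is an
# invented crawler name; real crawlers publish their own user-agent tokens.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text does not disallow user_agent from url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://publisher.example/articles/1"  # hypothetical URL
    print(may_crawl(ROBOTS_TXT, "ExampleAIBot", url))    # False: opted out
    print(may_crawl(ROBOTS_TXT, "GenericBrowser", url))  # True: no reservation
```

In this sketch the reservation targets a single named crawler while leaving other agents unaffected, which illustrates why standardised and widely recognised crawler identifiers matter for an opt-out regime to work in practice.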

Beyond lawful access, another area of convergence is the growing preference for market-based solutions, particularly licensing arrangements, rather than the creation of entirely new proprietary rights over training data. The US has explicitly supported the development of voluntary “licensing markets”^[27] and collective licensing agreements as a way to balance innovation with creators’ interests, arguing that government intervention as of now is premature. The EU^[28] and UK likewise favour licensing and opt-out systems as practical mechanisms that allow rights holders to maintain control while enabling technological innovation. Their policy frameworks emphasise transparency obligations, dataset disclosures,^[29] and stakeholder agreements between technology companies and creative industries. Japan’s policy focuses on dialogue and mutual understanding between creators, AI developers, and service providers.

Technical and organisational safeguards are another shared feature. The EU Code of Practice for general-purpose AI models: Copyright Chapter envisions signatory companies implementing measures that prevent AI models from generating outputs infringing on copyrighted content and adopting policies prohibiting infringing uses.

China’s strategy^[30] likewise stresses risk-assessment mechanisms, strict adherence to intellectual property rules, and structured mechanisms for lawful data trade. Japan encourages establishing rules and best practices among all stakeholders, while India continues to emphasise enforcement and bolstering of its existing framework to address unauthorised uses of creative works. In this sense, the policy consensus is not merely about the legality of training data but also about responsible behaviour of AI systems, including safeguards against infringing outputs.

Despite these shared policy concerns, jurisdictions diverge in regulatory architecture. The US continues to rely on a precedent-driven, case-by-case^[31] application of fair use, allowing contextual evaluation but creating uncertainty as legality is often determined through litigation. In contrast, the EU and the UK increasingly favour structured statutory frameworks built around text-and-data-mining exceptions, combined with rights-holder opt-out systems, machine-readable reservations, and transparency obligations. These aim to create predictable legal boundaries while preserving rights-holders’ ability to reserve their works from AI training.

However, practical challenges remain within this model, including the implementation of machine-readable reservations,^[32] permissible data retention periods, and ensuring compliance.^[33] China represents a distinct state-directed^[34] governance model, embedding AI training policies within broader strategies for data governance, economic development, and international technological cooperation. Chinese policy emphasises compliance with intellectual property and privacy rules, reflecting a model where government oversight plays a central role in shaping AI development.

Recent developments in Singapore and India illustrate emerging hybrid approaches. Singapore’s 2024 consultation on TDM exceptions for training LLMs for AI modules,^[35] which concerns its computational data analysis regime under the Copyright Act 2021, reaffirms the importance of lawful access safeguards while clarifying that authorities will not introduce a new exception allowing circumvention of access-control measures for AI model training purposes. This reflects a deliberate policy choice to preserve technological protection measures as a mechanism through which rights holders can structure digital business models and licensing arrangements.

Policymakers have emphasised that access-control mechanisms remain important not only for copyright holders’ remuneration but also for maintaining the security and stability of digital systems. The consultation process also highlighted the growing role of structured licensing partnerships, with publishers and technology companies entering into agreements that provide curated training datasets, API-based access, and enhanced citation systems that prove beneficial for AI models.^[36] Singapore therefore illustrates a model where statutory exceptions coexist with controlled access mechanisms and commercially negotiated licensing ecosystems.

India, meanwhile, is exploring a collective licensing model through its Working Paper, One Nation One License One Payment, proposing a blanket licence for AI training^[37] with royalties payable upon commercialisation and administered through a centralised mechanism.^[38] Taken together, these developments reveal a global policy environment characterised by shared objectives but divergent institutional strategies. Most jurisdictions converge on the principles of lawful data sourcing, transparency in training practices, respect for rights-holder reservations, and the need for safeguards against infringing outputs. At the same time, they differ in how these goals are operationalised: the US relies on doctrinal flexibility and litigation; the EU and UK build structured opt-out and transparency regimes; China integrates copyright governance into broader state-led AI strategies;^[39] Japan emphasises cooperative ethics and stakeholder communication;^[40] Singapore reinforces lawful access through technological safeguards and licensing partnerships; and India is exploring collective licensing frameworks to reduce transaction costs.

Overall, AI copyright governance is moving towards hybrid models that combine copyright law, licensing markets, transparency obligations, and technical safeguards. While no single framework has yet emerged as a global standard with universal consensus, the convergence around lawful access, licensing as a potential means of obtaining copyrighted data for LLM training, transparency obligations, and responsible AI deployment indicates the early contours of a harmonised and ethical international policy architecture for managing copyright issues in the age of generative AI.

Judicial Approaches to Copyright in AI Training

This section examines how courts across key jurisdictions are responding to disputes arising from the use of copyrighted works in AI model training and how judicial precedent is shaping the ambit of permissible use. Unlike statutes and policy instruments, which operate at a general level, case law shows how legal principles apply to concrete scenarios such as mass scraping, dataset creation, model training, and the generation of potentially infringing outputs. Courts are therefore becoming the primary sites where these laws and policies are tested in real time.

The previous section demonstrated that jurisdictions have adopted divergent statutory and policy approaches, ranging from flexible doctrines such as fair use and reasonable use to structured TDM exceptions, opt-outs, and licensing frameworks. Case law provides the missing interpretive layer, revealing how courts fill gaps, limit overbroad claims, and recalibrate policy ambitions.

By analysing leading and ongoing cases from the US, the EU, the UK, India, and China, this section traces emerging judicial patterns and divergences. It shows how courts are translating policy goals into enforceable standards, testing the effectiveness of existing frameworks, and, in some instances, implicitly guiding future regulatory reform. Together, these cases demonstrate that the legal governance of AI training is being shaped as much by litigation as by legislation.

United States: ‘Fair Use’ on Trial

The US cases sketch a fast-evolving but increasingly patterned landscape. Plaintiffs, including legal publishers, media houses, authors, artists, coders, musicians, and voice actors, allege that AI developers copied protected works at scale to train models and, in some instances, facilitate market-substituting outputs. Defendants counter that training constitutes fair use, that models store only statistical representations rather than expressive content, that output lacks substantial similarity, and that many state-law claims are pre-empted by federal copyright law.

In J. Doe v. GitHub,^[41] the anonymous plaintiffs sued GitHub over the development and operation of Copilot and Codex,^[42] alleging reproduction of code snippets identical or substantially similar to their copyrighted works without attribution. The court allowed certain breach-of-contract^[43] claims based on open-source licences to proceed and recognised standing for some plaintiffs, while dismissing several state-law claims as pre-empted.

Other cases remain active and procedurally varied. In the Re: OpenAI^[44] case, the court opined on a key procedural issue,^[45] centralising^[46] overlapping disputes against OpenAI and Microsoft under multidistrict litigation. It reasoned that although the plaintiffs’ claims differ, covering alleged copyright violations in training LLMs on copyrighted materials such as books, news articles, and YouTube video transcripts without consent, the removal of metadata, and AI-generated outputs, among others, the overlapping factual issues justified such centralisation.

In Andersen v. Stability AI,^[47] the court allowed key copyright claims^[48] by artists, including Sarah Andersen, to proceed to discovery.^[49] Rejecting the argument that models only contain ‘unprotectable data’ that “exist as statistical representations”,^[50] rather than actual images, the court noted that generative AI systems are built on protected content; whether such works are “contained” in models remains a matter for trial.

The writers’ cases against Anthropic (Bartz v. Anthropic^[51]) and record labels’ suits against AI music platform ‘Udio’ (Universal Music Group v. Udio^[52]) further highlight the contested methods of training AI models on copyrighted books and music without due licensing, compensation, or attribution to rights holders.

In the case of Bartz,^[53] the court distinguished between lawful and unlawful data sources in assessing fair use. It held that using lawfully purchased and digitised books to train Anthropic’s Claude LLMs was transformative and justified under fair use, as the training process created new, non-substitutive outputs and caused no market harm. However, the pirated books used to build Anthropic’s internal library had not been lawfully acquired, and their use directly displaced legitimate market demand. Thus, the court granted summary judgment^[54] for Anthropic on fair use for training and purchased copies but denied it for pirated materials. This led Anthropic to agree to a substantial settlement of US$1.5 billion with the Bartz plaintiffs.^[55] Interestingly, the key argument that training LLMs on pirated source material cannot attract the fair use exception, which led to the historic settlement in Anthropic’s case, was rejected in the case against Meta, where the use was ultimately deemed ‘fair use’.^[56]

In Kadrey et al. v. Meta Platforms Inc. (summary judgement of 2025), Judge Chhabria held^[57] that Meta’s use of pirated^[58] copyrighted books to train its Llama model qualified as fair use. The court found the training purpose highly transformative, the amount copied reasonable, and no evidence of market harm, thus favouring Meta on three of four fair-use factors. However, it acknowledged that future plaintiffs could prevail if they prove market dilution or indirect substitution caused by AI-generated content.^[59] The ruling, though limited to the 13 authors, signals that the fourth factor—market impact—could become more decisive in future AI copyright disputes.

In Ziff Davis v. OpenAI,^[60] the Court curtailed key copyright and Digital Millennium Copyright Act (DMCA) claims, holding that bypassing robots.txt restrictions does not amount to circumvention of a “technological protection measure” under the DMCA.^[61] It also dismissed “unjust enrichment” claims as pre-empted, reinforcing limits on state-law challenges.

Additional cases, including In re Google Generative AI,^[62] Concord Music Group v. Anthropic,^[63] O’Nan v. Databricks,^[64] and Milette v. Nvidia,^[65] raise similar issues around large-scale scraping, dataset construction, and the reproduction of copyrighted content in outputs. Parallel suits^[66] such as Lehrman v. Lovo^[67] and^[68] Vacker v. ElevenLabs^[69] extend the debate to voice cloning, testing the intersection of copyright and right of publicity.

Across these dockets, the key legal and ethical questions before US courts include the following:

Fair Use and Transformativeness in AI Training: US courts are examining whether training AI models on copyrighted works without permission qualifies as fair use, particularly when models and their outputs are commercial and may substitute for licensed human-created content. A central issue is how to evaluate “transformativeness”^[a] when AI tools generate competing research, code, images, music, or text derived from original copyrighted materials.

Derivative Works and the Nature of AI Models: A key question is whether AI models or their outputs qualify as “derivative works.” Courts are also assessing whether copyrighted materials embedded during training persist within models merely as statistical representations, or whether they remain legally recognisable reproductions of the original works.

Infringement Standards and Evidentiary Thresholds: Cases focus on what level of “substantial similarity” or regurgitation in AI outputs is necessary to establish copyright infringement and how plaintiffs can demonstrate that specific copyrighted works materially influenced a model’s generated output.

Regulatory and Contractual Constraints: Cases are probing the applicability of the DMCA, particularly its rules on the removal or alteration of copyright management information, to AI model training datasets and outputs, as well as whether open-source licences, terms of service, or robots.txt restrictions legally bind AI model developers.

Market Harm, Pre-emption, and Emerging Rights: Courts are considering market substitution effects, especially where established licensing markets exist (e.g., journalism, lyrics, and image libraries), alongside questions of federal copyright pre-emption over state tort claims and emerging personality rights in contexts such as voice cloning.

Ethically, these cases bring to the forefront consent, attribution, compensation, and transparency in data collection. In practical terms, they suggest that licensing, robust anti-scraping safeguards, and auditable training documentation will matter as much as legal doctrine in shaping the bounds of AI training on protected works.

European Union: Judicial Understanding of TDM and its Exceptions

Across these EU matters, courts and parties are testing how existing copyright laws (the InfoSoc Directive’s three-step test, the newer EU Digital Single Market (DSM) Directive TDM exceptions, and press publishers’ rights) apply to AI practices. Defendants generally assert that copying is lawful where the mined works are lawfully accessible, that such training is either non-commercial scientific research or a lawful TDM activity, and that machine learning stores only non-copyrightable statistical representations. Plaintiffs (publishers, photographers, collecting societies, and authors) counter that scraping and ingestion without consent or remuneration undermine licensed markets, violate explicit opt-outs, and harm markets when models reproduce or closely mimic protected expression.

Where courts have decided, results have been mixed but instructive. Amsterdam’s Howards Home ruling (DPG Media v Howards Home^[70]) interpreted the DSM exceptions in favour of the defendant. It held that short snippets (approx. 20 words) fell under the “very short extracts”^[71] exception of Article 15^[72] of the DSM Directive, applied the Berne three-step test,^[73] and concluded that the use did not interfere with the normal exploitation of the works or “unreasonably prejudice” the rights holders. Germany’s Kneschke v. LAION^[74] treated the dataset maker’s mass scraping to create a dataset as qualifying for the non-commercial scientific research exception^[75] (and flagged that natural-language reservations are ineffective against statutory TDM exceptions). The decision notably focused on the initial act of image reproduction for dataset creation and not on any downstream AI training.

In the Czech Republic’s case of S.S. v. Taubel,^[76] the court held that AI-generated outputs lacking sufficient human input do not meet the threshold of authorship required for copyright protection, because prompts, being ideas or abstract instructions, do not constitute protected expression.^[77]

In the German case of GEMA v. OpenAI,^[78] the German collecting society GEMA brought proceedings against OpenAI, alleging that the company had used protected song lyrics without obtaining licences while training the GPT-4 and GPT-4o AI models that power ChatGPT. The claim concerned lyrics from well-known German songs, which GEMA argued had been memorised within the model and could be reproduced almost verbatim through simple prompts. GEMA contended that this amounted to unauthorised reproduction and communication to the public under Sections 15, 16 and 19a of the German Copyright Act. OpenAI responded that its models do not store specific works but learn statistical relationships from training data and that any output is generated dynamically in response to user prompts. It further invoked the EU text-and-data-mining exceptions under Articles 3 and 4 of Directive (EU) 2019/790 on Copyright in the Digital Single Market, implemented in German law through Sections 60d and 44b UrhG.

The court, however, accepted evidence that training data can become embedded in model weights through ‘memorisation’, allowing the lyrics to be reproduced via simple prompts. Relying on Articles 2 and 3 of the InfoSoc Directive, the court held that such memorisation constituted fixation and reproduction of the works. Consequently, it largely upheld GEMA’s claims, concluding that the presence and output of the lyrics amounted to unauthorised reproduction and communication to the public.^[79]

Several high-profile disputes, including GEMA v Suno^[80] and France’s National Publishing Union v. Meta,^[81] remain pending. These cases raise core questions: whether TDM exceptions apply where rights holders have opted out (and whether such opt-outs must be machine-readable), how to assess market harm, and when verbatim or near-verbatim outputs cross into infringement.

United Kingdom: Exceptions and Extraterritorial Limits

In the UK, Getty Images v. Stability AI^[82] crystallises the legal and ethical tensions surrounding the use of copyrighted content in training generative models, while also exposing the limits of existing doctrine. Getty alleged large-scale scraping of over 12 million protected images to train Stable Diffusion without consent, asserting copyright, database, trademark, and passing-off claims. Stability AI acknowledged the presence of some Getty material in its training data but argued that outputs are generated through stochastic processes, do not reproduce original works and may qualify as “pastiche” under UK fair dealing. It also raised jurisdictional defences, noting that training occurred outside the UK.

In its November 2025 judgement, the High Court largely rejected the infringement claims before it, particularly secondary copyright claims, after Getty abandoned primary copyright and database right claims due to insufficient evidence of UK-based training.^[83] This left unresolved whether scraping or training within the UK would constitute infringement. The Court nonetheless permitted scrutiny of near-identical outputs and addressed limited trademark issues.

Against this backdrop, UK law provides only a narrow TDM exception under the Copyright, Designs and Patents Act 1988,^[84] confined to non-commercial research with lawful access, leaving most commercial AI training outside a clear statutory safe harbour. This may push UK-based developers towards licensing arrangements or reliance on contested fair dealing arguments. The UK government’s 2024 consultation^[85] proposes a conditional training exception, allowing use of lawfully accessible works unless rights are expressly reserved, supported by transparency and opt-out mechanisms. These proposals directly respond to unresolved issues of consent, provenance, and licensing highlighted in Getty.

Overall, the case signals that while courts are cautious in extending copyright liability extraterritorially, durable resolution of AI-copyright tensions in the UK will likely depend on legislative reform rather than litigation alone.

India: Fair Dealing, Extraterritorial AI Training, and the Search for Evidentiary Standards

Asian News International (ANI) v. OpenAI,^[86] currently before the Delhi High Court, crystallises three interlocking questions: whether the use of publicly available copyrighted news content to train a large language model infringes the Indian Copyright Act; whether AI-generated outputs can themselves infringe or cause reputational harm (for instance, through fabricated attributions); and whether Indian courts can assert jurisdiction when training and servers are located abroad but the service is accessible and harm occurs within India.^[87]

The issues framed include infringement vs Section 52 “fair dealing” exceptions,^[88] territorial jurisdiction, and evidentiary standards for ownership and direct reproduction. Two court-appointed amici curiae have advanced divergent views; one argues that copying per se may constitute infringement, while the other emphasises the need for the plaintiff to demonstrate ownership and substantially similar outputs. No final judgement has been issued; the case remains ongoing and is widely seen as a potential benchmark for India’s approach to AI training.

Legally and ethically, the courts will need to balance extraterritorial jurisdiction and the “localisation of harm” against limits on regulating foreign-based processing. Key questions include whether intermediate copies made during training constitute actionable reproductions or fall within Section 52 exceptions; what evidentiary threshold is required to link training data to specific outputs; and what remedies—such as deletion, disclosure of training datasets, licensing, compensation, or injunctive relief—are appropriate. These considerations intersect with broader concerns around attribution, consent, remuneration, and transparency.

China: Balancing of Innovation and Copyright in AI Model Training

China’s Xiaohongshu (Trik AI) case^[89] tests whether large-scale scraping and ingestion of artists’ images to train generative models is lawful. Illustrators allege unauthorised reproduction of their works, while Xiaohongshu relies on Article 24 of the Chinese Copyright Law (fair use)^[90] as a defence. This dispute operates within a broader regulatory ecosystem: the 2023 Interim Measures (Article 7^[91]) require AI service providers to use “legal sources” and to train AI models “in accordance with the law,” and thus to avoid infringing third-party IP. Providers should also obtain consent where personal data is involved and seek to improve training-data quality. The Supreme People’s Court guidance under Article 8^[92] urges a balanced approach that considers purpose, nature, quantity, and market impact when judging “reasonable” uses.

The court is thus positioned to address two core issues central to global AI copyright disputes: whether dataset creation for training constitutes actionable reproduction or falls within fair use, and how to assess the “quantity and quality” of use and any market harm to the work or its author. Practically and ethically, the court also needs to adjudicate on transparency obligations, creators’ entitlement to attribution or remuneration, how personal-data rules intersect with training, and the effect of regulatory duties (e.g., labelling and using only “legal sources”) on a platform’s defence.

Across jurisdictions, a discernible pattern is emerging in judicial approaches to AI training on copyrighted works. In the United States, courts are increasingly centring the analysis on ‘fair use’, often permitting AI training where it is shown to be ‘transformative’ and not to dilute or harm the market for the copyrighted works. Furthermore, AI LLM companies have to respect technological protection measures (TPMs) recognised under US law that copyright holders set up to prevent crawling of, and model training on, their works.

Decisions such as Thomson Reuters v. Ross and Bartz v. Anthropic indicate that commercial purpose, lawful source acquisition, and market impact are decisive. While courts accept that models operate through “statistical representations,” they remain open to probing whether protected expression is functionally embedded and whether outputs substitute for licensed markets. State-law claims are increasingly being dismissed as pre-empted by federal law, and weak technical barriers such as robots.txt are largely being rejected as technological protection measures. The US case law trajectory thus shows conditional tolerance of AI model training, tightly policed by evidence of substitution, piracy, or reputational harm.

In the EU, courts situate these questions within statutory Text and Data Mining (TDM) exceptions and the Berne three-step test, indicating a more structured approach. Early rulings (e.g., Kneschke v. LAION, DPG Media v Howards Home) favour defendants where uses are framed as non-commercial research or involve minimal extracts, yet unresolved disputes (GEMA, Meta) foreground opt-outs, remuneration, and cultural market harm, suggesting that consent mechanisms could become relevant in the future. The UK stands out with its narrow non-commercial TDM exception: the High Court in Getty v. Stability AI avoided resolving core questions, effectively deferring to legislative reform.

Indian and Chinese case law suggests that the two jurisdictions may focus on jurisdiction, the evidentiary threshold for infringement, and potential regulatory handling of the issue. Indian courts are now examining the territoriality of training sites and proof of output-based infringement. China, though its 2011 ‘Supreme People's Court Opinion’ does not explicitly mention “AI training,” seeks to balance copyright protection with the promotion of “business development and technological innovations,” while remaining committed to “fully protect the basic cultural rights and interests of the people.” To that end, the Opinion allows a use of works to be treated as “reasonable” after due consideration of factors like the “nature and purposes of use, the nature of works used, the quantity and quality of the part of works used, impacts of use on potential markets or values, and other factors, provided that such use neither contravenes the normal use of the works nor results in unreasonable damage to the lawful interests of the author.”^[93]

Overall, the global trend seems to point toward neither a categorical ban on nor carte blanche acceptance of AI model training on copyrighted works. Courts are increasingly converging on market impact, lawful access, and transparency as the basis on which the legality and legitimacy of AI training will be judged.

Global AI Training Law

Across the three sections (case law developments, comparative statutory frameworks, and policy guidance), clear patterns and fault lines emerge that map the legal terrain for AI training on copyrighted works. The case law survey shows US courts applying established fair-use factors to new contexts: plaintiffs allege large-scale, unlicensed ingestion of copyrighted content and market substitution, while defendants invoke transformation, statistical representation, and lack of concrete market harm.

Decisions to date are mixed but instructive. Courts have allowed some claims to proceed (Thomson Reuters v. Ross, GitHub/Copilot disputes, Getty v. Stability AI and Andersen v. Stability AI, among others) where commercial use, direct copying, or near-identical outputs raise material questions. Others (Bartz v Anthropic summary judgement; Kadrey v. Meta) accept that training on lawfully acquired copies can be transformative and may not cause present market harm. Critically, judges are treating the fourth fair-use factor (market effect, including indirect substitution) as potentially decisive where plaintiffs can marshal evidence of displacement or an emerging licensing market, but are rejecting speculative market-dilution theories absent concrete proof. Procedurally, multidistrict litigation and discovery battles (e.g., In re OpenAI, Ziff Davis) show that jurisdictional issues and transparency about datasets and scraping practices are now front and centre in such disputes.

The comparative legal analysis shows how these adjudicative questions translate into divergent national rules: the US relies on flexible, fact-sensitive fair-use doctrine, leaving outcomes to litigation; the EU couples statutory TDM exceptions (Articles 3 and 4 of the DSM Directive) with AI Act obligations, integrating commercial access with opt-outs and transparency mandates; the UK remains narrower (non-commercial TDM) but is contemplating opt-out/licensing reforms; China foregrounds “legal sources”^[94] and administrative oversight, invoking a quasi-reasonableness test; Japan’s “non-enjoyment” data-analysis carve-out and Singapore’s explicit computational-data-analysis exception provide clearer statutory safe harbours (subject to lawful access) that are relatively pro-innovation; and India remains traditional, urging application of existing fair-dealing rules and case-by-case adjudication.

These frameworks converge on lawful access, non-circumvention of TPMs and paywalls, and the desirability of licensing but diverge sharply on whether permissive, court-driven doctrine or prescriptive, machine-readable opt-out regimes should govern commercial training.

The section on global copyright policies on AI training earlier in this paper crystallises practical expectations emerging from both courts and statutes: transparency, technical safeguards against verbatim reproduction, designated points of contact and complaint mechanisms, and active development of licensing markets and interoperable, machine-readable rights reservations. Taken together, the three sections signal a hybrid global regime in formation—one that lacks uniformity but offers practical guidance. Developers are expected to avoid pirated sources, document lawful access, respect opt-outs/robots.txt exclusions where possible, and pursue licensing where feasible, while core legal questions remain subject to litigation and future legislation.
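As a purely illustrative aside, the expectation that developers respect robots.txt exclusions can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. The crawler name `ExampleAIBot`, the sample rules, and the `may_fetch` helper are hypothetical and introduced only for this sketch. It shows a technical courtesy check, not a legally sufficient rights-reservation mechanism; as noted above, at least one US court has declined to treat robots.txt as a technological protection measure.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve.
# "ExampleAIBot" is an invented crawler name used only for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        allowed = may_fetch(ROBOTS_TXT, agent, "https://example.com/articles/1")
        print(agent, "allowed" if allowed else "blocked")
```

A developer following the practices described above would run such a check before ingesting a page and would log the result alongside other evidence of lawful access. Machine-readable rights reservations under the EU framework may take other forms that this sketch does not cover.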

The most salient unresolved tensions likely to shape the next wave of disputes are evidentiary: what level of empirical proof suffices to show market displacement; how courts will assess whether statistical representations amount to “copies” or “derivative works”; and how cross-border models reconcile conflicting obligations (US fair use vs. EU opt-outs vs. China’s legality standard). In sum, AI and copyright jurisprudence is rapidly defining the boundary between transformative research and market-substitutive exploitation; policy is hardening around transparency and market solutions; and practitioners who heed licensing and technical mitigation will be best positioned to navigate a litigation-driven path toward regulatory equilibrium.

India’s Way Forward: Balancing Copyright and AI Governance

As AI continues to evolve across the creative, technological, and industrial domains, India stands at a critical juncture in shaping its copyright and AI governance framework. The global legal landscape shaped by landmark rulings such as Bartz v. Anthropic PBC and Kadrey et al. v. Meta Platforms Inc. illustrates the emerging tensions between innovation, data access, and the protection of creative works. These developments offer valuable insights for India’s policy direction. India’s approach could move towards a balanced, innovation-oriented framework that integrates the existing exceptions under the Copyright Act, 1957 with structured licensing mechanisms and institutional safeguards. At present, AI training activities fall within an uncertain legal space where developers may attempt to rely on the Act’s ‘fair dealing’ provisions, which permit limited uses for purposes such as research, review, and reporting.

While these provisions could potentially cover certain forms of non-expressive computational analysis, their purpose-specific structure and lack of explicit reference to large-scale data mining create ambiguity for AI developers and rights holders alike. One policy pathway, therefore, lies in clarifying how these existing exceptions interact with AI training while complementing them with a structured licensing framework for commercial-scale model development. The proposal outlined in the Working Paper on Generative AI and Copyright (Part 1) – One Nation One License One Payment provides a potential starting point. Under this model, AI developers would obtain a blanket licence to use lawfully accessed copyrighted material for training, without requiring individual negotiations with each rights holder. Royalties would arise upon commercialisation, with rates determined by a government-appointed committee and subject to judicial review.

A centralised system would manage royalty collection and distribution, reducing transaction costs and providing greater legal certainty for both developers and creators. Such a system could be administered through a national digital repository of copyrighted works, overseen by the Copyright Office and supported through public-private partnerships, enabling the creation of curated and rights-cleared datasets potentially reflecting India’s linguistic and cultural diversity in its AI innovations.

However, licensing mechanisms require careful design. Traditional copyright licensing systems could suffer from high transaction costs, opacity in the functioning of collecting societies, and unequal bargaining power between dissimilarly situated stakeholders.

Small AI developers and startups could likewise struggle to afford high licensing fees, potentially entrenching market power among large technology firms. These structural concerns mean that any collective licensing framework must incorporate transparency obligations, equitable revenue distribution mechanisms, and safeguards to ensure access for smaller developers and creators. Policymakers therefore need to treat licensing not as a complete solution but as one component of a broader governance architecture.

Strengthening institutional capacity, promoting dialogue between AI developers and creative industries, and encouraging judicial engagement with emerging disputes will be crucial. By combining clarified copyright exceptions with a carefully designed collective licensing regime, India can create a governance framework that both protects creators and enables the development of globally competitive AI systems.

Debajyoti Chakravarty is Research Assistant, Centre for Digital Societies, Observer Research Foundation.

All views expressed in this publication are solely those of the author, and do not represent the Observer Research Foundation, either in its entirety or its officials and personnel.

Endnotes

^[a] “Transformative use” of a copyrighted work is that which adds something new, with a further purpose or different character, and does not substitute for the original use of the work.

^[1] Cornell Law School, “17 U.S. Code § 107 - Limitations on Exclusive Rights: Fair Use,” Cornell Law School, https://www.law.cornell.edu/uscode/text/17/107 .

^[2] Begin at the Library, “Copyright & Fair Use: Fair Use,” Begin at the Library, https://libraryservices.acphs.edu/c.php?g=531993&p=3639636#:~:text=claim%20fair%20use?-,What%20is%20Fair%20Use?,for%20it%20to%20be%20used.&text=(4)%20the%20effect%20of%20the,to%20a%20fair%20use%20conclusion.

^[3] Penn State, “Fair Use,” Penn State, https://copyright.psu.edu/copyright-basics/fair-use/#:~:text=Section%20107%20of%20the%20Copyright,fair%20or%20is%20not%20fair.

^[4] Copyright.gov, “Chapter 11: Subject Matter and Scope of Copyright,” Copyright.gov, https://www.copyright.gov/title17/92chap1.html#107.

^[5] United States Copyright Office, Copyright and Artificial Intelligence Part 3: Generative AI Training Pre-publication Version a Report of the Register of Copyrights, Washington, D.C: US Copyright Office, 2025, https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf.

^[6] Copyright Alliance, “Fair Use Exception to Copyright,” Copyright Alliance, https://copyrightalliance.org/education/copyright-law-explained/limitations-on-a-copyright-owners-rights/fair-use-exceptions-copyright/#:~:text=Historically%20the%20first%20factor%20has,of%20fair%20use%20even%20murkier .

^[7] Penn State, “Fair Use,” Penn State, https://copyright.psu.edu/copyright-basics/fair-use/ .

^[8] Office of the General Counsel, “Copyright and Fair Use,” Harvard University, https://ogc.harvard.edu/pages/copyright-and-fair-use .

^[9] Legislation.gov.uk, “Copyright, Designs and Patents Act 1988 UK Public General Acts1988 c.48 Part I Chapter III General Section 29A,” Legislation.gov.uk, https://www.legislation.gov.uk/ukpga/1988/48/section/29A .

^[10] Gov.uk, “Closed Consultation Copyright and Artificial Intelligence,” Gov.uk, https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence .

^[11] EUR-LEX, “Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on Copyright and Related Rights in the Digital Single Market and Amending Directives 96/9/EC and 2001/29/EC (Text with EEA relevance.),” EUR-LEX, https://eur-lex.europa.eu/eli/dir/2019/790/oj/eng .

^[12] EUR-LEX, “Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence and Amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act) (Text With EEA Relevance),” EUR-LEX, https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689 .

^[13] China Aerospace Studies Institute, “Interim Measures for the Management of Generative Artificial Intelligence Services,” China Aerospace Studies Institute, https://www.airuniversity.af.edu/Portals/10/CASI/documents/Translations/2023-08-07%20ITOW%20Interim%20Measures%20for%20the%20Management%20of%20Generative%20Artificial%20Intelligence%20Services.pdf .

^[14] China Law Translate, “Copyright Law of the PRC (2021 Version),” China Law Translate, https://www.chinalawtranslate.com/en/Copyright-Law-of-the-PRC-(2021-Version)/#_Toc56756782 .

^[15] Law Info China, “Notice of the Supreme People's Court on Issuing the Opinions on Issues Concerning Maximizing the Role of Intellectual Property Right Trials in Boosting the Great Development and Great Prosperity of Socialist Culture and Promoting the Independent and Coordinated Development of Economy,” Law Info China, http://lawinfochina.com/display.aspx?id=9280&lib=law .

^[16] CRIC, “Copyright Law of Japan,” CRIC, https://www.cric.or.jp/english/clj/cl2.html#art47-5 .

^[17] Agency for Cultural Affairs, “General Understanding on AI and Copyright in Japan,” Government of Japan, https://www.bunka.go.jp/english/policy/copyright/pdf/94055801_01.pdf .

^[18] Agency for Cultural Affairs, “General Understanding on AI and Copyright in Japan,” Government of Japan, https://www.bunka.go.jp/english/policy/copyright/pdf/94055801_01.pdf .

^[19] Singapore Statutes Online, “Copyright Act 2021,” Singapore Statutes Online, https://sso.agc.gov.sg/Act/CA2021

^[20] AI Verify Foundation, Model AI Governance Framework for Generative AI, May 2024, Singapore, AI Verify Foundation, 2024, https://aiverifyfoundation.sg/wp-content/uploads/2024/05/Model-AI-Governance-Framework-for-Generative-AI-May-2024-1-1.pdf .

^[21] Copyright Office, “The Copyright Act, 1957,” Government of India, https://www.copyright.gov.in/Documents/Copyrightrules1957.pdf .

^[22] Swati Sharma and Akshit Singla, “Fair Dealing in the Digital Age: Navigating Copyright for News and Online Content in India,” Cyril Amarchand Mangaldas, June 4, 2025, https://corporate.cyrilamarchandblogs.com/2025/06/fair-dealing-in-the-digital-age-navigating-copyright-for-news-and-online-content-in-india/#:~:text=Critically%2C%20Indian%20law%20specifies%20that,fair%20dealing%20defense%20to%20apply.

^[23] European Commission, “The General-Purpose AI Code of Practice,” European Commission, https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai .

^[24] Intellectual Property Office, “Copyright and AI: Consultation,” UK Government, https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence .

^[25] United States Copyright Office, Copyright and Artificial Intelligence Part 3: Generative AI Training Pre-Publication Version a Report of the Register of Copyrights, Washington D.C: US Copyright Office, 2025, https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf .

^[26] Congress.gov, “Generative Artificial Intelligence and Copyright Law,” Congress.gov, https://www.congress.gov/crs-product/LSB10922 .

^[27] United States Copyright Office, Copyright and Artificial Intelligence Part 3: Generative AI Training Pre-Publication Version a Report of the Register of Copyrights, Washington D.C: US Copyright Office, 2025, https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf .

^[28] Tristan Marcelin and Filippo Cassetti, “AI and Copyright: The Training of General-Purpose AI,” European Parliamentary Research Service, April 23, 2025, https://www.europarl.europa.eu/thinktank/en/document/EPRS_ATA(2025)769585 .

^[29] Intellectual Property Office, “Copyright and AI: Consultation,” UK Government, https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence

^[30] Xinhua News Agency, “At the 20th Collective Study Session of the CCP Central Committee Politburo, Xi Jinping Stresses: Persist in Being Self-Reliant, be Strongly Oriented Toward Applications, and Push the Orderly Development of Artificial Intelligence,” CSET, April 28, 2025, https://cset.georgetown.edu/publication/xi-politburo-collective-study-ai-2025/ .

^[31] Bitlaw, “Fair Use in Copyright Law,” Bitlaw, https://www.bitlaw.com/copyright/fair-use.html#:~:text=The%20doctrine%20of%20fair%20use,The%20First%20Amendment .

^[33] Magdalena Serafin, “The EU AI Act and Copyrights Compliance,” IAPP, April 30, 2025, https://iapp.org/news/a/the-eu-ai-act-and-copyrights-compliance .

^[34] Kristy Loke, Paul Triolo, Helen Toner, Johanna Costigan, Scott Singer, Gabriel Wagner, Jason Zhou and Kevin Neville, “Forum: Xi’s Message to the Politburo on AI,” DigiChina, April 30, 2025, https://digichina.stanford.edu/work/forum-xis-message-to-the-politburo-on-ai/ .

^[35] Intellectual Property Office of Singapore, Ministry of Law Singapore, https://www.mlaw.gov.sg/files/Summary_of_Key_Changes_to_Prescribed_Exceptions_in_Part_6__Division_1_of_the_Copyright_Regulations_2021.pdf

^[36] “Singapore: Updated Copyright Regulations on Circumvention of Access Controls,” Baker McKenzie, March 31, 2025, https://insightplus.bakermckenzie.com/bm/intellectual-property/singapore-updated-copyright-regulations-on-circumvention-of-access-controls

^[37] Department for Promotion of Industry and Internal Trade, Ministry of Commerce and Industry, Government of India, Working Paper on Generative AI and Copyright Part 1 One Nation One License One Payment: Balancing AI Innovation and Copyright, New Delhi: DPIIT, 2025, https://www.dpiit.gov.in/static/uploads/2025/12/ff266bbeed10c48e3479c941484f3525.pdf

^[38] Ministry of Commerce and Industry, Government of India, https://www.pib.gov.in/PressReleasePage.aspx?PRID=2200741&reg=3&lang=1

^[39] “Action Plan for Global Governance of Artificial Intelligence (Full Text),” Xinhua News Agency, July 26, 2025, https://www.gov.cn/yaowen/liebiao/202507/content_7033929.htm

^[40] Agency for Cultural Affairs Government of Japan, “General Understanding on AI and Copyright in Japan-Overview,” Legal Subcommittee Copyright Subdivision of the Cultural Council Japan Copyright Office, https://www.bunka.go.jp/english/policy/copyright/pdf/94055801_01.pdf .

^[41] FindLaw, “Doe V. Github Inc 2023,” FindLaw, https://caselaw.findlaw.com/court/us-dis-crt-n-d-cal/2200493.html

^[42] Samyak Deshpande, “The John Doe v. GitHub Case Explained,” IndicPacific, August 28, 2024, https://www.indicpacific.com/post/the-john-doe-v-github-case-explained

^[43] Jose Florinio Farcon, “Attribution or Attrition? Doe 1 V. Github, Inc. as a Case for a Robust, Horizontal, Moral Right of Attribution in Gen AI,” SSRN, October 9, 2024, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4946503

^[44] CourtListener, “In re OpenAI Inc.,” CourtListener, https://storage.courtlistener.com/recap/gov.uscourts.nysd.640396/gov.uscourts.nysd.640396.1.0.pdf

^[45] McKool Smith, “AI Infringement Case Updates upto April 14, 2025,” McKool Smith, https://www.mckoolsmith.com/newsroom-ailitigation-18#:~:text=Plaintiffs%20alleged%20that%20OpenAI%20used,light%20of%20the%20transfer%20order

^[46] Tal Dickstein and Chloe Gordils, “In Re: OpenAI Inc., Copyright Infringement Litigation,” Loeb and Loeb, June 27, 2025, https://www.loeb.com/en/insights/publications/2025/06/in-re-openai-inc-copyright-infringement-litigation

^[47] Casetext, “Andersen v. Stability AI Ltd.,” Casetext, https://perma.cc/U9VG-XRPV

^[48] Zach Schor, “Andersen v. Stability AI: The Landmark Case Unpacking the Copyright Risks of AI Image Generators,” NYU Law, December 2, 2024, https://jipel.law.nyu.edu/andersen-v-stability-ai-the-landmark-case-unpacking-the-copyright-risks-of-ai-image-generators/

^[49] Kevin Madigan, “Top Takeaways From Order in the Andersen v. Stability AI Copyright Case,” Copyright Alliance, August 29, 2024, https://copyrightalliance.org/andersen-v-stability-ai-copyright-case/

^[50] Tal Dickstein and Edward Delman, “Andersen v. Stability AI Ltd,” Loeb and Loeb, October 30, 2023, https://www.loeb.com/en/insights/publications/2023/11/andersen-v-stability-ai-ltd

^[51] Frank D. D'Angelo and Elena De Santis, “Bartz v. Anthropic PBC,” Loeb and Loeb, June 23, 2025, https://www.loeb.com/en/insights/publications/2025/07/bartz-v-anthropic-pbc

^[52] Universal Music Group, “Universal Music Group and Udio Announce Udio’s First Strategic Agreements for New Licensed AI Music Creation Platform,” Universal, https://www.universalmusic.com/universal-music-group-and-udio-announce-udios-first-strategic-agreements-for-new-licensed-ai-music-creation-platform/

^[53] Frank D. D'Angelo and Elena De Santis, “Bartz v. Anthropic PBC,” Loeb and Loeb, June 23, 2025, https://www.loeb.com/en/insights/publications/2025/07/bartz-v-anthropic-pbc

^[54] Copyright Alliance, “Andrea Bartz, Charles Graeber and Kirk Wallace Johnson, Plaintiffs, v. Anthropic PBC, Defendant,” United States District Court, Northern District of California, https://copyrightalliance.org/wp-content/uploads/2025/06/Bartz-v.-Anthropic-Order.pdf

^[55] Dave Hansen, “The Bartz v. Anthropic Settlement: Understanding America's Largest Copyright Settlement,” Kluwer Copyright Blog, November 10, 2025, https://legalblogs.wolterskluwer.com/copyright-blog/the-bartz-v-anthropic-settlement-understanding-americas-largest-copyright-settlement/

^[56] Caitlin Hadlee and Melissa Yan, “Is it Fair? Lessons From Bartz v Anthropic and Kadrey v Meta,” Hudson Gavin, November 13, 2025, https://www.hgmlegal.com/insights/is-it-fair-lessons-from-bartz-v-anthropic-and-kadrey-v-meta

^[57] Knowledge Centre Data and Society, “United States - Richard Kadrey et al. v. Meta Platforms Inc. - (Artificial) Market Harm as a Decisive Factor,” Knowledge Centre Data and Society, https://data-en-maatschappij.ai/en/publications/united-states-richard-kadrey-et-al-v-meta-platforms-inc-artificial-market-harm-as-a-decisive-factor

^[58] Joseph E. Martineau, Bridget Hoy, Kirk A. Damman, Michael J. Hickey, John B. Greenberg and Benjamin J. Siders, “Generative AI Training and Fair Use: The Anthropic and Meta Decisions,” Lewis Rice, July 10, 2025, https://www.lewisrice.com/publications/generative-ai-training-and-fair-use-the-anthropic-and-meta-decisions#:~:text=In%20Kadrey%20v.,the%20fourth%20fair%20use%20factor.

^[59] Jason L. Haas, “Kadrey v. Meta: The First Major Test of Fair Use in the Age of Generative AI,” ECJ Blogs, May 14, 2025, https://www.ecjlaw.com/ecj-blog/kadrey-v-meta-the-first-major-test-of-fair-use-in-the-age-of-generative-ai-by-jason-l-haas

^[60] Barry Sookman, “Ziff Davis v OpenAI: Key Copyright Litigation Ruling,” McCarthy Tétrault LLP, December 22, 2025, https://www.lexology.com/library/detail.aspx?g=33bc4964-a99b-43e9-8ad7-a4a23540c38c

^[61] Annelise Levy, “OpenAI Wins Partial Dismissal of Ziff Davis Copyright Lawsuit,” Bloomberg Law, December 16, 2025, https://news.bloomberglaw.com/ip-law/openai-wins-partial-dismissal-of-ziff-davis-copyright-lawsuit

^[62] Bleichmar Fonti and Auld LLP, “In re Google Generative AI Copyright Litigation,” BLF Consumer, Privacy and Antitrust, https://www.bfalaw.com/cases-investigations/in-re-google-generative-ai-copyright-litigation

^[63] Anthony Leung, “Legal Update: Concord Music Group v Anthropic - Is It Really a Victory for AI Firms?,” Haldanes, April 10, 2025, https://www.lexology.com/library/detail.aspx?g=286e69fd-2e16-4828-a25e-e22aa1878667#:~:text=This%20could%20spark%20some%20interesting,defence%20of%20%E2%80%9Cfair%20use%E2%80%9D

^[64] AI Law and Policy, “Stewart O’Nan V. Databricks,” AI Law and Policy, https://www.ailawandpolicy.com/wp-content/uploads/sites/65/2024/03/Databricks-Inc.pdf

^[65] “NVIDIA Asserts Millette Lacks Standing Due to Failure to Alleged ’A Concrete, Particularized Injury In Fact’ Under Constitution, Simply Based on Scraping Youtube Videos To Train AI,” Chat GPT Is Eating the World, November 16, 2024, https://chatgptiseatingtheworld.com/2024/11/16/nvidia-asserts-millette-lacks-standing-due-to-failure-to-alleged-a-concrete-particularized-injury-in-fact-under-constitution-simply-based-on-scraping-youtube-videos-to-train-ai/

^[66] Meaghan Gragg and Sigrid Jernudd, “Lehrman v. Lovo, Inc.: Voice Actors Take on AI Voice Generation,” HHR Art Law, January 28, 2025, https://www.hhrartlaw.com/2025/01/lehrman-v-lovo-inc-voice-actors-take-on-ai-voice-generation/

^[67] Reuters, “AI Voiceover Company Stole Voices of Actors, New York Lawsuit Claims,” The Hindu, May 17, 2024, https://www.thehindu.com/sci-tech/technology/ai-voiceover-company-stole-voices-actors-new-york-lawsuit-claims/article68185205.ece

^[68] ChatGPT is Eating the World, “Karissa Vacker, Mark Boyett, Brian Larson, Iron Tower Press, Inc., and Vaughn Heppner, Plaintiffs, V. Eleven Labs, Inc., Defendant,” ChatGPT is Eating the World, https://chatgptiseatingtheworld.com/wp-content/uploads/2024/08/Vacker-v-Eleven-Labs-COMPLAINT.pdf

^[69] “Vacker et al v. ElevenLabs, Inc.,” Law 360, August 29, 2024, https://www.law360.com/cases/66d09d62e010cd09bb6de75a

^[70] De Rechtspraak, “In the Case of DPG Media B.V., and Others V. Knowledge Exchange B.V. Trading as Howardshome,” De Rechtspraak, https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBAMS:2024:6563

^[71] Etienne Valk and Iris Toepoel, “DPG Media et al Vs. Howards Home – A National Ruling on DSM’s Press Publishers' Rights and TDM Exceptions,” Kluwer Copyright Blog, January 16, 2025, https://legalblogs.wolterskluwer.com/copyright-blog/dpg-media-et-al-vs-howardshome-a-national-ruling-on-dsms-press-publishers-rights-and-tdm-exceptions/

^[72] EUR-Lex, “Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on Copyright and Related Rights in the Digital Single Market and Amending Directives 96/9/EC and 2001/29/EC (Text With EEA Relevance),” EUR-Lex, https://eur-lex.europa.eu/eli/dir/2019/790/oj

^[73] Andres Guadamuz, “Can a Dutch Case About RSS Teach Us Anything About AI Copyright?,” TechnoLlama, December 1, 2024, https://www.technollama.co.uk/can-a-dutch-case-about-rss-teach-us-anything-about-ai-copyright

^[74] OpenJur, “Regional Court of Hamburg, Judgment of 27.09.2024 - 310 O 227/23,” OpenJur, https://openjur.de/u/2495651.html

^[75] WIPO, “Germany DE048-J 2024 WIPO IP Judges Forum Informal Case Summary – Hamburg Regional Court, Germany [2024]: Robert Kneschke V. LAION E.V., Case No. 310 O 227/23,” WIPO, https://www.wipo.int/wipolex/en/text/592042

^[76] Media Report, “SS V. Taubel,” Media Report, https://mediareport.nl/wp-content/uploads/2024/04/praag-en.pdf

^[77] Jan Ježek, “Czech Republic - S. Š. v Taubel Legal, Advokátní Kancelář S.R.O. , 11 October 2023,” CMS, May 15, 2024, https://cms.law/en/alb/publications/artificial-intelligence-and-copyright-case-tracker/czech-republic-s.-s.-v-taubel-legal-advokatni-kancelar-s.r.o

^[78] Higher Regional Court of Munich, Government of Germany, https://ifrro.org/resources/documents/General/German_Court_OpenAI_Memory_Output_Infringe_Copyright_NOV25.pdf

^[79] Simon Hembt, “Landmark Ruling of the Munich Regional Court (GEMA V OpenAI) on Copyright and AI Training,” Bird and Bird, November 14, 2025, https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training

^[80] Gema, “Fair Remuneration Demanded: GEMA Files Lawsuit Against Suno Inc.,” Gema, https://www.gema.de/en/w/press-release-lawsuit-against-suno

^[81] David Mouriquand, “French Publishers and Authors Sue Meta Over Copyright Works Used in AI Training,” Euro News, March 13, 2025, https://www.euronews.com/culture/2025/03/13/french-publishers-and-authors-sue-meta-over-copyright-works-used-in-ai-training#:~:text=Vincent%20Montagne%2C%20the%20president%20of,companies%20over%20data%20and%20copyright

^[82] Sophie Goossens, Jean-Luc Juhan, Susan Kempe-Müller, Alfonso Lamadrid, Myria Saarinen, Tim Wybitul, Gail E. Crawford, James Lloyd and Fiona M. Maclean, “Getty Images v. Stability AI: English High Court Rejects Secondary Copyright Claim,” Latham and Watkins, November 13, 2025, https://www.lw.com/en/insights/getty-images-v-stability-ai-english-high-court-rejects-secondary-copyright-claim#fn1

^[83] Royal Courts of Justice, “Getty Images (US) Inc and Others V. Stability AI Limited,” Royal Courts of Justice, https://www.judiciary.uk/wp-content/uploads/2025/11/Getty-Images-v-Stability-AI.pdf

^[84] UK Legislation, “Copyright, Designs and Patents Act 1988,” UK Legislation, https://www.legislation.gov.uk/ukpga/1988/48/section/29A

^[85] Intellectual Property Office, “Copyright and AI: Consultation,” UK Government, https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence

^[86] Espie Angelica A. De Leon, “OpenAI: No AI Training in India, No Copyright Infringement,” Asia IP, April 28, 2025, https://asiaiplaw.com/section/news-analysis/openai-no-ai-training-in-india-no-copyright-infringement

^[87] Bhavini Srivastava, “Using Public Data to Train Chatgpt is Not Commercial Use: OpenAI to Delhi High Court,” Bar and Bench, April 29, 2025, https://www.barandbench.com/news/using-public-data-to-train-chatgpt-is-not-commercial-use-openai-to-delhi-high-court

^[88] India Code, “Section 52 Certain Acts Not to be Infringement of Copyright,” India Code, https://www.indiacode.nic.in/show-data?actid=AC_CEN_9_30_00006_195714_1517807321712&sectionId=14572&sectionno=52&orderno=70

^[89] Ye Zhanhang and Jiang Zuer, “Xiaohongshu Accused of Using Users’ Artwork to Train Its AI,” Sixth Tone, August 14, 2023, https://www.sixthtone.com/news/1013514

^[90] China Law Translate, “Copyright Law of the PRC (2021 Version),” China Law Translate, https://www.chinalawtranslate.com/en/Copyright-Law-of-the-PRC-(2021-Version)/#_Toc56756774

^[91] China Law Translate, “Interim Measures for the Management of Generative Artificial Intelligence Services,” China Law Translate, https://www.chinalawtranslate.com/en/generative-ai-interim/

^[92] Law Info China, “Notice of the Supreme People's Court on Issuing the Opinions on Issues Concerning Maximizing The Role of Intellectual Property Right Trials in Boosting the Great Development and Great Prosperity of Socialist Culture and Promoting the Independent and Coordinated Development of Economy [Effective],” Law Info China, http://lawinfochina.com/display.aspx?id=9280&lib=law

^[93] Law Info China, “Notice of the Supreme People's Court on Issuing the Opinions on Issues Concerning Maximizing The Role of Intellectual Property Right Trials in Boosting the Great Development and Great Prosperity of Socialist Culture and Promoting the Independent and Coordinated Development of Economy [Effective],” Law Info China, http://lawinfochina.com/display.aspx?id=9280&lib=law

^[94] China Law Translate, “Interim Measures for the Management of Generative Artificial Intelligence Services,” China Law Translate, https://www.chinalawtranslate.com/en/generative-ai-interim/

The views expressed above belong to the author(s).


Author

Debajyoti Chakravarty

Debajyoti Chakravarty is a Research Assistant at ORF’s Center for New Economic Diplomacy (CNED) and is based at ORF Kolkata. His work focuses on the use ...

Occasional Papers. Published on May 18, 2026

AI, Copyright, and the Future of Creativity

Introduction

Copyright Law in AI Training: A Comparison of Seven Jurisdictions

United States: The ‘Fair Use’ Doctrine and Doctrinal Flexibility

United Kingdom: Lawful Access and Emerging Opt-Out Framework

European Union: Structured Exceptions and Obligations Consolidated Under the AI Act

China: Legal Source Requirement and State-Led Governance

Japan: Non-Enjoyment Test and Context-Based Limitations

Singapore: Permissive Computational Data Analysis Exception

India: Application of Existing Copyright Law

Comparative Trends, Convergences, and Divergences

Global Copyright Policies on AI Training

Judicial Approaches to Copyright in AI Training

United States: ‘Fair Use’ on Trial

European Union: Judicial Understanding of TDM and its Exceptions

United Kingdom: Exceptions and Extraterritorial Limits

India: Fair Dealing, Extraterritorial AI Training, and the Search for Evidentiary Standards

China: Balancing of Innovation and Copyright in AI Model Training

Global AI Training Law

India’s Way Forward: Balancing Copyright and AI Governance

Endnotes
