The interaction of artificial intelligence and copyright law has become one of the most contentious tech policy debates in the United Kingdom, centering on whether AI developers should be permitted to train their models on copyrighted material without explicit consent or remuneration. This debate has exposed a deep fracture between the creative industries, which seek to protect their intellectual property from unauthorised commercial exploitation, and tech companies. The academic and library sectors are also impacted, and argue that overly restrictive copyright laws hinder scientific research and the UK's sovereign AI capabilities. In 2024, the UK government proposed a broad text and data mining (TDM) exception to copyright that would have allowed AI companies to use publicly available copyrighted material for training, offering creators only an "opt-out" mechanism, similar to the exception introduced in Europe. This proposal faced intense opposition from across the creative sector. Trade unions representing writers, musicians, performers, and journalists argued that such an exception would effectively expropriate their members' work for the commercial benefit of tech giants. A report from the House of Lords Communications and Digital Committee, warned that generative AI posed a "clear and present danger" to the £124 billion creative economy. The government abandoned the opt-out model in March 2026, opting instead to build a stronger evidence base before pursuing any copyright reform. Conversely, the academic and library sectors have raised significant concerns that the UK's current TDM exception, which is strictly limited to non-commercial research, is too narrow. Universities and research libraries occupy a dual role as both creators of vast datasets and beneficiaries of TDM exceptions. They argue that the current legal framework restricts their ability to computationally analyse the very research they produce, thereby hobbling the UK's "AI for Science" strategy. Advocacy groups have highlighted a "triple payment" problem, wherein publicly funded research is handed over to publishers, who then charge universities substantial subscription fees and demand additional payments for specific TDM licences. This tension is further complicated by the commercial practices of major academic publishers. While publishers often restrict universities from using subscribed databases for AI training, they have simultaneously entered into lucrative, multi-million-dollar licensing agreements to sell access to this academic content to commercial AI developers. Furthermore, academics have accused publishers of actively steering authors away from permissive open-access licences towards more restrictive variants. By doing so, publishers retain the exclusive commercial rights necessary to strike these AI training deals, often without consulting the original authors or offering them any additional remuneration. This dynamic has not only reopened debates within the Open Access movement but has also created complex legal scenarios where publishers, rather than authors, control the terms of copyright litigation against major tech companies. == Training on copyrighted material == The question of whether AI developers should be permitted to train their models on copyrighted material without payment or consent has been one of the most contentious policy debates in the UK AI landscape. In 2024, the then-Conservative government proposed a broad text and data mining (TDM) exception that would have allowed AI companies to use any publicly available copyrighted material for training purposes, with creators able only to "opt out" of having their work used. This proposal provoked intense opposition from writers, musicians, visual artists, publishers, and broadcasters, who argued it would effectively expropriate their intellectual property for the commercial benefit of AI companies. The debate over text and data mining exceptions extends significantly beyond generative AI and the creative industries, implicating a wide range of scientific, industrial, and academic research applications. TDM is a foundational process for analysing large datasets to identify patterns, trends, and correlations, which is heavily utilised in fields such as medical research, climate modelling, and financial services. In the scientific and academic sectors, researchers rely on TDM to process vast amounts of published literature. For example, in biomedical research, TDM is used to accelerate drug discovery, identify new uses for existing medicines, and extract insights from clinical notes and genomic datasets. However, the application of traditional copyright frameworks to scientific literature has been criticised by academics. Researchers argue that scientific writing is intended to convey factual, verifiable information rather than creative originality, and that copyright restrictions on TDM hinder reproducibility, validation, and the advancement of science. The current UK copyright exception for TDM (Section 29A of the Copyright, Designs and Patents Act 1988) is limited strictly to non-commercial research, which creates barriers for public-private research partnerships and commercial scientific development. Beyond academia, non-generative AI and TDM are critical to various industrial and commercial operations. In the financial services sector, TDM is employed to monitor transactions, detect fraud, and analyse market feeds. Other non-generative applications include search engine indexing, plagiarism detection software, and media monitoring. A 2026 report by Public First estimated that 19% of UK businesses use specialised TDM tools, and that a restrictive copyright regime requiring licenses for all copyrighted content could cost the UK economy £220 billion in lost AI-driven GDP growth by 2035 compared to a broad commercial TDM exemption. Industry advocates argue that the lack of a commercial TDM exception in the UK creates legal uncertainty that stifles innovation across these broader, non-generative applications of data analysis. === Tech and AI industry positions === The technology and artificial intelligence industries lobbied for a broad text and data mining (TDM) exception to UK copyright law, arguing that such an exception is essential for the UK to remain globally competitive in AI development. Industry bodies such as techUK have argued that without a TDM exception, the UK risks becoming an "AI taker rather than an AI maker," as developers will relocate training operations to jurisdictions with more permissive copyright regimes, such as the United States, Japan, Singapore, and the European Union. During the UK government's 2024–2025 consultation on copyright and AI, major AI developers and trade associations strongly supported "Option 2" (a broad TDM exception) or "Option 3" (a TDM exception with an opt-out mechanism). OpenAI stated in its consultation response that a broad TDM exception is "necessary to drive AI innovation and investment in the UK," arguing that developers should be permitted to train models on lawfully accessed copies without further distribution. The Computer and Communications Industry Association (CCIA) similarly argued that restricting TDM to non-commercial development would undermine the government's ambitions for the UK tech sector and frustrate partnerships between commercial entities and research institutions. Tech industry advocates have also highlighted the economic implications of copyright policy. According to analysis by the think tank UK Day One, adopting an overly restrictive licensing-only approach could result in the UK economy losing up to £182 billion over 20 years, whereas a broad TDM exception could generate a positive impact of £131.61 billion over the same period. Following the government's March 2026 decision to drop plans for a TDM exception in favour of a market-led licensing approach, techUK's Deputy CEO Antony Walker criticised the move, stating that "copyright material cannot be used for AI development and training without permission" under the current framework, which he argued would push AI model training to the US. === Creative sector and political opposition to text and data mining === In March 2026, the House of Lords Communications and Digital Committee published a report, AI, Copyright and the Creative Industries, which concluded that the creative industries face "a clear and present danger from generative AI" and that it would be "a very poor bet" for the government to weaken copyright protections to attract AI investment. The Committee noted that the creative industries contributed £124 billion to the UK economy in 2023 and employed 2.4 million people, compared to the AI sector's £12 billion GVA and 86,000 employees in 2024. The Committee called on the government to develop a "licensing-first" regime underpinned by mandatory transparency requirements, and to rule out any new commercial TDM exception with an opt-out model. Tra
Read more →