Interview / Publishing / Future Archives
Jack Newton, Head of Content Protection and Enforcement at the Publishers Association, talks to Litro about giant pirate networks, AI training, shadow libraries, takedown systems and why copyright protection is becoming central to publishing’s future.
A publisher used to think of piracy as a leak: a file copied, uploaded, shared and downloaded outside the proper channels. The damage was serious, but the shape of the problem was at least recognisable. There was an infringing copy. There was a site hosting it. There was a takedown notice, a rights holder, an enforcement trail.
That world has not disappeared. If anything, it has grown larger and more sophisticated. Jack Newton, Head of Content Protection and Enforcement at the Publishers Association, describes “giant pirate networks” carrying millions of infringing books and scientific articles, using increasingly organised methods to profit from other people’s work. But the sharper problem now is what those networks feed.
In the age of large language models, piracy is no longer only about illegal access. It is about input. Unauthorised copies of books, academic works and other protected material can become training material, absorbed into systems that generate value at scale while leaving authors, researchers and publishers without permission, transparency or payment.
That is why copyright has moved from the back office to the centre of publishing’s future. The question is no longer simply whether a publisher can remove an illegal file after it appears online. The question is whether books can carry enough ownership, permission and provenance signals to remain visible, attributable and defensible in systems that were not built around consent.
For smaller publishers, the tension is especially sharp. They need their books to be discoverable. They need search engines, retailers, libraries, reviewers and readers to find their work. But visibility without control can become exposure. A book that is easy to find but easy to scrape, misattribute, copy or feed into unauthorised systems may end up creating value for everyone except the people who made it possible.
Newton’s work sits exactly at that fault line. Since joining the Publishers Association in 2022, after years in anti-piracy roles across sport, academic publishing and content protection, he has worked with publishers, law enforcement and government on the practical realities of digital infringement. His brief includes piracy, takedown systems, shadow libraries, enforcement strategy and the growing collision between copyright and AI.
He is clear on one point: the term “shadow library” is misleading. These are not libraries in the true sense. Genuine libraries buy, license, preserve and circulate books within a public-interest framework. Pirate networks do something else entirely. They strip books and research of permission, context and lawful exchange, often at a scale comparable to some of the largest information platforms in the world.
For Litro’s Future Archives strand, Newton’s answers sharpen a recurring question: what does it mean to preserve literary culture when the systems around books are changing faster than the protections around them? With independent publishers, the issue is visibility. With museums, it is institutional memory. With content protection, the question becomes harder still: how do publishers defend the record itself when copying, training, recommendation and monetisation can happen at machine speed?
The future archive is not only what survives on a shelf. It is also the network of rights, records, permissions, metadata, enforcement systems and trusted institutions that allow a work to be recognised as itself. Without that, discoverability risks becoming extraction. Protection, in this sense, is not an obstacle to access. It is part of the infrastructure that makes legitimate access possible.
“Copyright-protected works should not be used to train AI without permission and fair payment.”
About Jack Newton
Jack Newton is Head of Content Protection and Enforcement at the Publishers Association. He is a seasoned anti-piracy expert with nearly 18 years of experience protecting intellectual property across the creative industries.
He first entered content protection in 2008, working for a consultancy that specialised in protecting brand and live streaming content for major sporting organisations around the world. In 2014, he joined Springer Nature as Senior Anti-Piracy Manager. In 2018, as part of his role at Springer Nature, he began a secondment at the City of London Police Intellectual Property Crime Unit, where he helped raise awareness of intellectual property theft within the publishing sector.
Since joining the Publishers Association in 2022, Newton has worked with members, law enforcement and government on piracy issues, including action against major piracy platforms such as Z-Library.
On pirate networks and AI
Eric Akoto, Litro
What are the main content protection issues publishers are dealing with at the moment?
The two biggest challenges facing the publishing industry, Newton says, are giant pirate networks and AI language models, along with the connection between the two.
He describes giant pirate networks as websites, or groups of websites, engaged in the mass illegal sharing of books and scientific articles.
“The level of infringing cases on these pirate sites can run into the millions,” he says, “and the networks use sophisticated methods to profit from piracy.”
The second issue is the unauthorised use of copyright-protected works, whether sourced from major pirate networks or elsewhere, in the training, development and operation of AI models.
Newton says this has drawn widespread attention and remains a serious concern for the Publishers Association, its members and the wider industry.
“It has caused, and continues to cause, significant harm to the creative, human and financial investment of authors, creators, researchers, academics and publishers.”
On copyright after AI
Eric Akoto, Litro
How has AI changed the conversation around copyright, piracy and enforcement?
Newton does not frame AI as a total rupture. Instead, he says it has intensified the conversation and, in some cases, raised the stakes.
The Publishers Association, he explains, supports publishers and industry partners as the sector adapts to rapid developments in AI. Copyright and AI remain central to that work.
The PA has been working on these questions for years, including through industry roundtables, evidence to the Lords Communications and Digital Committee, and engagement with parliamentarians, ministers and officials.
“Our position is clear,” Newton says. “Copyright-protected works should not be used to train AI without permission and fair payment.”
The organisation is urging government to legislate for transparency on training data, and supports licensing as the route for AI developers to use publishers’ high-quality content.
For Newton, giant pirate networks form part of the backdrop to AI language model infringement. Reducing harm at source, he argues, has a wider impact.
“Work to reduce the harmful impact of shadow libraries is and will be a key part of the PA’s content protection strategy for years to come.”
On shadow libraries
Eric Akoto, Litro
What should publishers understand about shadow libraries and unauthorised use of books or content?
Newton begins with language. The phrase “shadow library,” he argues, gives pirate networks a legitimacy they do not deserve.
“It is important to make the distinction that these pirate networks are not libraries in the true sense of the word,” he says. “Let me be very clear: these are not libraries at all.”
Genuine libraries, he notes, license and buy books and serve a legitimate purpose as centres for information and storytelling. The Publishers Association supports that role.
Pirate networks, by contrast, operate at a staggering scale. For comparison, Newton points to Wikipedia, which has more than 65 million webpages and is regularly ranked among the world’s most visited sites. Some pirate networks allegedly hold research papers in similar numbers, with figures cited at more than 64 million and more than 95 million.
“The sheer size of these pirate networks is staggering,” he says.
The way these networks obtain content varies. Some material is user-uploaded. Some comes from scanning physical books. In some cases, Newton says, content is stolen from publisher databases through compromised login credentials from universities and institutions.
The operators are often unknown, or based in countries where enforcement is difficult.
“The way these pirate networks are set up goes beyond the norm of just a pirate site,” Newton says. “What they are becoming is akin to an organised criminal network or organised crime group.”
On discoverability and protection
Eric Akoto, Litro
What practical steps can smaller publishers take to protect their work while still keeping books discoverable?
For Newton, the first step is often procedural but important: send a takedown notice.
“While it may not always remove the content immediately,” he says, “it helps evidence the harm and, when combined with reports from other publishers, builds a clearer picture of the scale and seriousness of a threat.”
He acknowledges that many publishers do not have the resources to manage this alone. The Publishers Association offers members access to a Copyright Infringement Portal, providing notice-and-takedown services.
Protection, however, does not remove the need for discoverability. Newton points to one practical step on that side: keeping on top of search engine optimisation to maintain and enhance a work’s visibility.
That connection matters. A publisher cannot simply disappear behind locked doors. Books need to be found, reviewed, bought, borrowed, cited, taught and recommended. But the systems that make a work visible must also preserve enough context and rights information to stop visibility turning into extraction.
When Discoverability Becomes Extraction
Newton’s answers make clear that content protection is no longer a narrow enforcement function. It sits at the intersection of copyright, technology, metadata, licensing, search visibility and public trust.
That is the hard problem for publishers now. The same systems that help readers find books can also expose those books to scraping, copying, misattribution and unauthorised reuse. The same metadata that makes a work legible to retailers and search engines may also need to carry stronger signals about authorship, permission, licensing and provenance.
For small and independent publishers, this is not an abstract policy debate. It is an operational question: how do you keep work visible without leaving it undefended?
The future archive will not be protected by sentiment alone. It will depend on records, rights, enforcement, licensing, trusted infrastructure and the practical discipline of making sure a work remains findable without being stripped of the people and permissions behind it.



