GenAI developments pose existential questions for publishers
Generative AI Initiative Newsletter Blog | 13 August 2025
Greetings, everyone! I hope you have had a chance to enjoy a bit of a break around this time of the year before plunging back into the fast-changing world of GenAI.
I’d like to tell you today about a few interesting things that have caught my eye recently. They carry potentially serious implications for our business.
One is a threat that we perhaps did not anticipate when we paywalled our content and assumed it was safe from bots. The other, much more in line with the sunny weather we are enjoying in the northern hemisphere, is a ray of hope for any news brand trying to create new products — and new revenue streams — with AI.
Thanks for reading,
Sonali
Chatbots can replicate paywalled content
ChatGPT, Perplexity, and Grok can provide summaries of paywalled content, according to a recent investigation. They replicate paywalled news articles without ever accessing the content behind the paywalls.
Instead, they use publicly available fragments — quoted snippets, social media re-posts, web archives, and metadata — to reconstruct full text or accurate summaries.
This practice often works even on content behind the paywalls of The Atlantic, The New York Times, and The Financial Times — news brands that have invested extensively in building hard paywalls that are difficult to circumvent.
The report comes from Henk Van Ess, an internationally recognised expert in online research methods. Is it accurate? I decided to test it. The blog post I linked to above is itself paywalled, so, purely for the purpose of verification, I posted the link in ChatGPT and asked for a summary of it.
The result above is what I got, and it roughly matches what Van Ess himself has said about the investigation. Perplexity told me much the same, adding: “While older debates focused on AI training on paywalled or copyrighted content, the latest and more urgent concern is that AI bots are exploiting the public digital ecosystem to deliver premium content.”

As Van Ess says: We see “AI systems performing real-time searches to actively reconstruct paywalled articles from live, untrained data sources — content they’ve never encountered during their original training.
“Most chatbots have rules not to break paywalls, and say so loudly, but the internal reasoning documents obtained during this investigation show they’re systematically planning and executing these circumvention operations while maintaining plausible deniability about their methods.”
The report found AI systems successfully reconstructed about half of paywalled content across a sampling of top-tier publications, especially popular stories that have already been widely discussed online.
OK, fine, you say. It’s only half of the content. But take a moment to think about how much content your publication produces — and how much of that the typical reader reads anyway. Is it really more than 50%?
This comes just days after Cloudflare, a prominent content delivery network, said it would block bots from accessing publishers’ content by default and instead ask them to “pay per crawl.”
Cloudflare’s supporters include Condé Nast, Dotdash Meredith (now People), Ziff Davis, The Associated Press, Gannett, The Atlantic, Fortune, and Time. This development has the potential to hinder AI chatbots’ ability to harvest data for training and search purposes.
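Before network-level blocking of the kind Cloudflare offers, the conventional first line of defence has been a robots.txt file listing the AI crawlers a publisher wants kept out. A minimal sketch of such a policy is below — the user-agent names shown (GPTBot, PerplexityBot, ClaudeBot, Google-Extended, CCBot) are real, published crawler identifiers, but note that compliance with robots.txt is entirely voluntary on the crawler's part, which is exactly why default-on blocking is a meaningful change.

```txt
# robots.txt — ask known AI crawlers to stay out, allow everything else
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

This only governs well-behaved crawlers that identify themselves honestly; it does nothing about the reconstruction-from-public-fragments problem described above.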
But, as you can see, the tech companies are not accessing the content directly in order to know what the article said. Instead, they are crawling social media sites and other publicly available forums — and then using the power of generative AI to reassemble the gist of the articles based on screenshots and reader comments.
It also appears ChatGPT can now autonomously bypass Cloudflare’s “I am not a robot” test, one of the most common security measures sites use to block automated traffic. If an LLM can deceive online verification systems, websites will need to re-evaluate their human-verification methods. (To be fair, LLM agents are perhaps not robots in the strictest sense of the word, but the point of the test is to distinguish humans from machines.)

How does one deal with this?
We will be addressing these and other existential questions that the news media industry faces at our Media and Tech Week in San Francisco in October, where we will be hearing from the vice-president of product at Cloudflare and executives from Scalepost and Prorata — two companies that provide ways for publishers to charge Big Tech for using content for their LLMs.
If you have read this far, this is a week that you’ll want to spend with us.
GenAI as the engine for new products — and revenue
Speaking of excellent conferences: One of the many outstanding speakers at INMA’s Mumbai conference in July was the head of digital design and product at The Times of India, Rohit Garg, who outlined dozens of ways AI is helping one of India’s largest news publishers go faster and further than ever before.
“We always had diverse user needs,” Garg said. “We always had volume. We always had velocity. We always had consistency that we were creating at scale. What AI is helping us do is doing all of these things more efficiently. It helps us ensure that we don't crack under chaos.”
That sounds like an efficiency play, right? But at Times Internet, AI is actually helping the team create new personalised products, with an eye to unlocking new revenue.
For example, they have created an AI anchor (which is, incidentally, quite common in the Indian news industry) for a streaming financial news show.
The next step is personalisation.
“What we think and what we are researching is a future wherein, imagine you as a person put in your five, seven, 10 stocks and you pay for it. And then there is a live anchor which generates this summary, this synopsis specifically for you,” he said. “This can run 24/7. This minimises editorial workload.”
The Times is also using AI to drive personalised push notifications — where different people receive different alerts based on their preferences. The headlines are also rewritten for different types of audiences to increase engagement.
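To make the idea concrete, here is a minimal sketch of preference-based push routing: each subscriber only receives alerts matching topics they follow, and the headline is restyled per audience segment. Everything here — the segment names, data shapes, and template-based rewrite standing in for an LLM call — is a hypothetical illustration, not The Times’ actual implementation.

```python
# Hypothetical sketch: route a push alert only to subscribers who follow
# the article's topic, and restyle the headline per audience segment.
# A real system would call an LLM to rewrite; templates stand in here.

SEGMENT_STYLES = {
    "markets": lambda h: f"Markets alert: {h}",
    "casual": lambda h: f"In case you missed it: {h}",
}

def route_push(article, subscribers):
    """Return (subscriber_id, personalised_headline) pairs."""
    pushes = []
    for sub in subscribers:
        if article["topic"] not in sub["topics"]:
            continue  # skip users who don't follow this topic
        style = SEGMENT_STYLES.get(sub["segment"], lambda h: h)
        pushes.append((sub["id"], style(article["headline"])))
    return pushes

article = {"topic": "stocks", "headline": "Sensex closes at record high"}
subscribers = [
    {"id": "u1", "segment": "markets", "topics": {"stocks", "economy"}},
    {"id": "u2", "segment": "casual", "topics": {"cricket"}},
]
print(route_push(article, subscribers))
```

The design point is that personalisation has two independent axes — who gets the alert, and how it is worded — and each can be automated separately.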
How about news products for meeting a different user need? The Times’ team has created a completely AI-generated satirical take on news in a scroll-first format for social media, where it has garnered more than 1.5 million views in a month.

Another creative use of GenAI: creating a crossword puzzle based on the previous day’s news as well as a new game called Connect.
“The advent of AI has helped us create some of these games at a much faster pace than we were able to do before. Till some time back, we were only able to create games like Sudoku or crosswords using AI. Now, we are using AI to think of new games. A recent example is the game called Connect, which went live last week. It took us 15 days to ideate and 15 days to implement it.
“Within a month, we created a game which never existed earlier, and we have 1,000 copies of it stored in our database because the AI algorithm has helped us create 1,000 copies which are unique to each other. And we are already seeing great traction on that game. It is, in fact, the best-performing game right now on our platform.”
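The “1,000 copies which are unique to each other” idea boils down to generate-and-deduplicate: keep producing variants and discard any that match one already stored. A minimal sketch under stated assumptions — the word list, variant size, and seeded random selection are invented stand-ins for whatever The Times’ AI pipeline actually generates:

```python
import random

# Hypothetical sketch of uniqueness-checked game generation: produce
# variants until n mutually distinct ones exist. A seeded random draw
# stands in for the AI generation step.

WORDS = ["delhi", "mumbai", "sensex", "nifty", "monsoon", "cricket",
         "bollywood", "chai", "metro", "loksabha", "rupee", "taj"]

def generate_variant(rng):
    """One puzzle variant: a fixed-size selection of items, order-free."""
    return frozenset(rng.sample(WORDS, 8))

def build_unique_bank(n, seed=0):
    rng = random.Random(seed)
    bank = set()
    while len(bank) < n:
        bank.add(generate_variant(rng))  # set membership discards duplicates
    return bank

bank = build_unique_bank(100)
print(len(bank))
```

Storing variants in canonical form (here, a frozenset, so ordering doesn’t matter) is what makes the duplicate check trivial.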
The Times has also used GenAI to create a product with strong audio-visual elements to reach new audiences.
“Can we have news rewritten for young minds between 8 to 11 years old wherein the news is written to make sure that it resonates with them? Not the Gen Zs but somebody who is in school right now. It’s a kids-first AI rewriting engine. All the stories are rewritten keeping kids in mind,” Garg said.
In advertising, The Times has further improved its ad chat product, which answers readers’ questions about the advertised product instead of simply displaying a box or banner ad. It is now integrated with WhatsApp, a platform used by more than 850 million people in India.

“This also leads to automated lead capture,” Garg said, while the back end captures insights from the conversations.
Worthwhile links
- GenAI and scaling: Why is it so hard? Read about the Garbage Can model and the Bitter Lesson, and how they come together in organisations. I got serious déjà vu.
- GenAI and scaling II: Many employers still lack AI policies, leading employees to use bots in secret and disclose sensitive information.
- GenAI and deals: Amazon’s New York Times deal is worth US$20 million to US$25 million a year, The Wall Street Journal reports.
- GenAI and video and chat: Now, we have a conversational video chatbot?
- GenAI and bias: LLMs consistently advise women to ask for lower salaries than men, even when both have identical qualifications.
- GenAI and bias II: How to mitigate bias that is built into models?
- GenAI and jobs: Which jobs are most likely to be automated away?
- GenAI and unexpected behaviour: Has your LLM been acting moody and emotional recently? Sucking up to you or making up facts? Some insights into why.
- GenAI and language: Frequent users of generative AI tools are beginning to adopt their linguistic style in everyday conversations.
- GenAI and humans: An AI model appeared in an advertisement in Vogue, sparking an uproar.
- An AI diversion: A CEO fired his entire team of developers because he was using Claude Code. But it’s not as simple as that.
About this newsletter
Today’s newsletter is written by Sonali Verma, based in Toronto, and lead for the INMA Generative AI Initiative. Sonali will share research, case studies, and thought leadership on the topic of generative AI and how it relates to all areas of news media.
This newsletter is a public face of the Generative AI Initiative by INMA, outlined here. E-mail Sonali at sonali.verma@inma.org or connect with her on INMA’s Slack channel with thoughts, suggestions, and questions.