Reddit escalates its fight against AI bots

With AI eating the public web, Reddit is going on the offensive against data scraping. 

Reddit’s logo. Illustration by William Joel / The Verge

In the coming weeks, Reddit will start blocking most automated bots from accessing its public data. You’ll need to make a licensing deal, like Google and OpenAI have done, to use Reddit content for model training and other commercial purposes. 

While this has technically been Reddit’s policy all along, the company is now enforcing it by updating its robots.txt file, a core part of the web that tells crawlers how they’re allowed to access a site. “It’s a signal to those who don’t have an agreement with us that they shouldn’t be accessing Reddit data,” the company’s chief legal officer, Ben Lee, tells me. “It’s also a signal to bad actors that the word ‘allow’ in robots.txt doesn’t mean, and has never meant, that they can use the data however they want.”
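
For context on what those “allow” rules look like: robots.txt is just a plain-text list of per-crawler directives, and well-behaved bots check it before fetching anything. Here is a minimal sketch of that check, using Python’s standard urllib.robotparser and an illustrative blanket-disallow rule set — not Reddit’s actual file:

    # How a compliant crawler consults robots.txt before fetching pages,
    # using Python's standard library. The rules below are illustrative,
    # not Reddit's real file.
    from urllib.robotparser import RobotFileParser

    # A hypothetical blanket-disallow policy of the kind described above:
    rules = [
        "User-agent: *",
        "Disallow: /",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # With no Allow carve-out, every path is off-limits to every bot.
    print(parser.can_fetch("SomeAIBot", "https://www.reddit.com/r/all/"))  # False

The catch, and part of why Lee frames the change as a “signal,” is that robots.txt is purely advisory: nothing in the protocol technically stops a scraper from ignoring the answer.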

My colleague David Pierce recently called robots.txt “the text file that runs the internet.” Since it was conceptualized in the early days of the web, the file has primarily governed whether search engines like Google can crawl a website to index it for results. For the last 20 years or so, the give-and-take — Google sending traffic in exchange for the ability to crawl — mostly made sense for everyone involved. Then, AI companies started ingesting all the data they could find online to train their models. 
