Most of the major AI agents still follow robots.txt, according to Dark Visitors founder Gavin King. “They’re pretty consistent,” he says. But not all website owners have the time or the know-how to keep their robots.txt files up to date. And even when they do, some bots ignore the file’s instructions. “They try to fake their traffic,” King says.
Prince says Cloudflare’s bot blocking isn’t merely a request that these kinds of bad actors can ignore. “Robots.txt is like putting up a ‘No Entry’ sign,” he says. “This is like having armed guards patrolling a physical wall.” In addition to flagging other kinds of suspicious web activity, such as price-scraping bots used for illegal price monitoring, the company has created a process to find even the most carefully hidden AI crawlers.
Cloudflare also announced it will soon open a marketplace where customers can negotiate scraping terms with AI companies, whether that means charging a fee for the use of their content or receiving credits for AI services in exchange for allowing scraping. “We’re not particularly concerned with the transaction, but we think there needs to be a way to give value back to the original content creators,” Prince says. “The compensation doesn’t have to be monetary. The compensation could be credits or recognition. It could be a variety of things.”
No date has been set for the marketplace’s launch, but even if it launches later this year, it will join an increasingly crowded field of projects aimed at facilitating licensing agreements and other permission arrangements between AI companies, publishers, platforms, and other websites.
How are AI companies taking this? “We’ve spoken to most of them, and their reactions have ranged from ‘This makes sense, we’ll accept it’ to ‘Go to hell,’” says Prince, who declined to name names.
The project came together fairly quickly. Prince says the impetus came from a conversation with Atlantic CEO (and former WIRED editor-in-chief) Nick Thompson, who told him that many publishers were contending with rogue web scrapers. “It’s great that he’s doing it,” Thompson says. If even major media organizations were struggling to deal with the influx of scrapers, Prince reasoned, independent bloggers and website owners would have an even harder time.
Cloudflare has long been a leader in web security, providing much of the infrastructure that powers the web. The company has historically tried to remain as neutral as possible about the content of the websites it serves. While it has made rare exceptions to that rule, Prince has stressed that Cloudflare does not want to be the arbiter of what is allowed online.
Here, though, Prince thinks Cloudflare is uniquely positioned to help. “The path we’re on is not sustainable,” he says. “Hopefully, we can be part of getting humans paid for their work.”