
Tweet by @cramforce


People talk about Cloudflare blocking AI crawlers. Some nuance: This is something we considered shipping at Vercel but ultimately decided against. The TL;DR is that because of pesky game-theory and system-analysis stuff, this type of marketplace will not work OR it will work, but have bad outcomes. Obviously, both options are bad.

The current state: Cloudflare is aiming to create a marketplace where AI crawlers pay for the content they crawl. This has not actually shipped. As a basis for it, Cloudflare started blocking AI crawlers from accessing content on Cloudflare's CDN under some circumstances.

The 1st bit of nuance: As a percentage of websites, almost nobody wants this. If you sell goods or services, then you want to be crawled by AI and get free advertising from the AI.

Things are, of course, different for media sites. If your content is the product, or, dare I say, the user reading the content is the product, then AI is acting as a substitute. You don't want it to get access to all of your content and be able to create substitute content. Because this is real, we shipped an *opt-in* feature that lets such sites opt out of AI crawling.

For the Cloudflare marketplace to be successful, there has to be content on it that the AI crawlers want so badly that they are willing to pay for it. AI crawlers want two categories of stuff:

- Absolutely everything (quantity matters)
- The best, most unique stuff

Hence a marketplace must be both large (has a good chunk of everything) and high quality (has the good stuff).

Cloudflare is a big enough chunk of the internet to be "large", but it has to turn the feature on by default for the chunk to have a chance to be big enough. If the feature were opt-in, not enough people would opt in because, well, see above, most sites don't want this.

Even if you somehow get enough sites to opt in, some are always incentivized to opt out: while AI might take away a lot of traffic, being the only site the AI can link to would be very valuable. So, in practice, even media sites opt to allow AI crawlers. This is a classic tragedy of the commons.

Next, the best, most unique stuff. It won't be on the marketplace. Why? For the same reason there isn't a Netflix-for-news. The stuff is expensive to produce, but the outlets are substitutes for one another, and the most valuable players go it alone (see Reddit licensing directly to Google, and the NYT suing in a case that might eventually settle into a direct deal).

Next, let's assume everything I said above is false, and the marketplace has the right content so that it *could* work. It will still fail. That is because the content is subject to the DRM problem: A single digital copy that leaves the crawler-wall is enough to circumvent the entire scheme. Did y'all notice that all the AI companies are shipping browsers? If you give your content to a "human", the AI crawler can get a copy without doing direct crawling.

And finally: AI agents are user agents. It's a real problem that the ad-based business model of the web is under threat. But there is also real value in AI-supported content consumption. We need to find a way to make it work, not break the new stuff to keep the old business model, which was already struggling, on life support.

Garry Tan
@garrytan

CloudFlare blocked all AI on *.ycombinator.com without our permission or even notifying us. Perhaps this was a bug?
