
kentonv, today at 6:21 PM

Cloudflare crawl respects robots.txt. It does not attempt to bypass any anti-crawling measures. If the site doesn't want to be crawled -- whether it uses Cloudflare or not -- this product will not help you crawl it.
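For readers unfamiliar with what "respects robots.txt" means in practice, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is not Cloudflare's implementation, just an illustration of the general check a well-behaved crawler performs before fetching a URL. The robots.txt content, the `ExampleBot` user agent, the `example.com` URLs, and the `allowed` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/docs/", "https://example.com/private/page"):
        verdict = "fetch" if allowed(ROBOTS_TXT, "ExampleBot", url) else "skip"
        print(f"{verdict}: {url}")
```

A crawler that follows this pattern simply skips any URL the site has disallowed, rather than trying to work around the rule.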

Some sites actually want crawlers -- e.g. sites that sell a product, documentation sites, etc. That's what this product is meant for.

https://x.com/CloudflareDev/status/2031745285517455615

(Disclosure: I work for Cloudflare but not on this product.)