Doesn't work for pages protected by Cloudflare, in my experience. What a shame, they could've created the problem and sold the solution.
That's too funny. If true, really looking forward to the Cloudflare response here. I'm unsure how you would spin that in a way that didn't seem self-serving.
Cloudflare crawl respects robots.txt. It does not attempt to bypass any anti-crawling measures. If the site doesn't want to be crawled -- whether it uses Cloudflare or not -- this product will not help you crawl it.
Some sites actually want crawlers -- e.g. sites that are selling a product, documentation, etc. That's what this product is meant for.
https://x.com/CloudflareDev/status/2031745285517455615
(Disclosure: I work for Cloudflare but not on this product.)
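For anyone wondering what "respects robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not Cloudflare's implementation; the bot name `ExampleBot`, the `may_crawl` helper, and the example.com URLs are all hypothetical, and the rules are parsed from an in-memory string rather than fetched from a real site.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: ExampleBot is kept out of /private/,
# every other crawler is allowed everywhere.
EXAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /
"""

docs_ok = may_crawl(EXAMPLE_ROBOTS, "ExampleBot", "https://example.com/docs/intro")
private_ok = may_crawl(EXAMPLE_ROBOTS, "ExampleBot", "https://example.com/private/report")
other_ok = may_crawl(EXAMPLE_ROBOTS, "OtherBot", "https://example.com/private/report")
```

A well-behaved crawler would run a check like this before each fetch and skip any URL for which it returns False. That is separate from bot-protection products, which block requests on the server side regardless of what robots.txt says.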
I imagine that would cause a backlash from the website owners trusting Cloudflare to keep their content 'safe'.
Wait. What?
Is this just a way to strong-arm non-cloudflarians into adopting their platform if you don't want your site crawled? It does sound like they are selling the solution to avoid their own content crawler.
As long as it gets past Azure's bot protection ...
Came here to write this. I am getting much better results from Firecrawl (not affiliated with them, just a happy customer).
Please tell me you are joking.
That’s what they are doing. This is a textbook protection racket.
“Buy Cloudflare bot protection, otherwise it would be a shame if your site got scraped and DDoS’d.”
Who is doing the scraping and DDoSing? Cloudflare.