All this reactionary outrage in the comments is funny. And lame.
Yes, for the vast majority of the internet, serving traffic has near-zero marginal cost. Not for LLMs though – those requests are orders of magnitude more expensive.
This isn't controversial at all; it's a well-understood fact, outside of this irrationally angry thread at least. Maybe you don't understand the economic term "marginal cost" and so miss the limited scope of my statement.
If such DDOSes as you mention were common, such a scraping strategy would not have worked for the scraper at all. But no, they're rare edge cases, from a combination of shoddy scrapers and shoddy website implementations, including the lack of even basic throttling for expensive-to-serve resources.
The vast majority of websites handle AI traffic fine though, either because they don't have expensive-to-serve resources, or because they properly protect such resources from abuse.
If you're an edge case who is harmed by overly aggressive scrapers, take countermeasures. Everyone with that problem should; that's neither new nor controversial.
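For what it's worth, the "basic throttling" mentioned above could look something like a per-client token bucket. This is only a minimal sketch: the class name, the `rate` and `capacity` parameters, and keying clients by a plain string are all assumptions for illustration, not anything a particular server ships with.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second per client, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second (assumed value, tune per resource)
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable clock, handy for testing
        self.buckets = {}         # client key -> (tokens_left, last_seen_time)

    def allow(self, client):
        """Return True if this client's request may proceed, False if it should be throttled."""
        now = self.clock()
        tokens, last = self.buckets.get(client, (self.capacity, now))
        # Refill in proportion to the time elapsed, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client] = (tokens - 1, now)
            return True
        self.buckets[client] = (tokens, now)
        return False
```

In practice you would call `allow(client_ip)` only in front of the expensive-to-serve endpoints and answer with HTTP 429 when it returns False, leaving the cheap static pages alone.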
It's not a cost for me to scrape an LLM.
It is a cost for me when an LLM scrapes me.
Why should I care about the costs they have when they don't care about the costs I have?
The extent of the utilization is new.
The number of bots that try to hide who they are and don't even bother to check robots.txt is new.
"They are rare edge cases"? Are we on the same internet?
One euro is marginal for me; for someone else it is their daily meal.
"such DDOSes as you mention were common, such a scraping strategy would not have worked for the scraper at all"
They are common. The strategy works for the LLM but not for the website owner or the users who can't use the site during the attack.
The majority of sites are not handling AI fine. Getting DDoSed only part of the time is not acceptable. Countermeasures like blocking huge IP ranges can help but also lock out legitimate users.
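To make the collateral damage concrete, here is a rough sketch of range blocking using Python's standard `ipaddress` module. The blocklist is hypothetical (the ranges are reserved documentation networks, not real scraper ranges), and the function names are made up for illustration.

```python
import ipaddress

# Hypothetical blocklist; both entries are documentation-only ranges, not real scrapers.
BLOCKED = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(addr, networks=BLOCKED):
    """Return True if the address falls inside any blocked range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


def blocked_address_count(networks=BLOCKED):
    """Count how many addresses the blocklist covers, scrapers and ordinary users alike."""
    return sum(net.num_addresses for net in networks)
```

The check has no idea who is behind an address: every address in a blocked range is refused, so the wider the range, the more ordinary visitors get caught along with the bots.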