Hacker News

jwr, today at 5:32 PM (2 replies)

An interesting and sad aspect of the ongoing war on bots and scraping is that we are hurting ourselves in the process, too. Many tasks I'm trying to get my AI assistant to do cannot be done quickly, because sites defensively prohibit access to their content. I'm not scraping: it's my agent trying to fetch a page or two to perform a task for me (such as checking pricing or availability).

We need a better solution.


Replies

johneth, today at 8:14 PM

I would assume most sites that block access to your AI assistant do so because they want to show a human ads, i.e. not run at a loss. Seems reasonable.

bee_rider, today at 6:01 PM

You aren’t scraping for the sake of training a model, but scraping the prices and availability is still scraping, right?

I think some of the folks running sites would rather have you go to the site and view the items “suggested based on your shopping history” (I consider these ads, the vendors might disagree), etc.

I’m more sympathetic to the people running sites than the LLM training scrapers, but these are two parties in a many-party game and neither one is perfectly aligned with users.