Hacker News

trollbridge, yesterday at 7:45 PM

And whilst the IA will honour requests not to archive/index, more aggressive scrapers won't, and will disguise their traffic as normal human browser traffic.

So we've basically decided we only want bad actors to be able to scrape, archive, and index.


Replies

JumpCrisscross, yesterday at 8:46 PM

> we've basically decided we only want bad actors to be able to scrape, archive, and index

AI training will be hard to police. But a lot of these sites inject ads in exchange for paywall circumvention. Just scanning Reddit for links to the newest archive.is mirror, or whatever has replaced it, should be enough to cut off most of the traffic.
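
For what it's worth, the scanning part could be sketched roughly like this. This is only an illustration under stated assumptions: the hostname list in `ARCHIVE_HOSTS` and the helper name `find_archive_links` are hypothetical, and it works on text you already have rather than calling any Reddit API.

```python
import re
from urllib.parse import urlparse

# Assumption: an illustrative list of archive-mirror hostnames.
# A real list would need to be maintained as mirrors come and go.
ARCHIVE_HOSTS = {"archive.is", "archive.ph", "archive.today"}

# Rough URL matcher: stops at whitespace and common closing punctuation.
URL_RE = re.compile(r"https?://[^\s)>\[\]]+")


def find_archive_links(text):
    """Return the URLs in `text` whose host is a listed archive mirror."""
    hits = []
    for url in URL_RE.findall(text):
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.archive.is the same as archive.is
        if host in ARCHIVE_HOSTS:
            hits.append(url)
    return hits


sample = "Paywalled? Try https://archive.is/abc12 or read https://example.com/story instead."
print(find_archive_links(sample))
```

Feeding it comment or post bodies gives you the candidate links; what you then do with them (send takedown requests, block referrers, and so on) is a separate question.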