Gigachad • today at 1:38 AM • 3 replies

It's because they want to restrict AI companies from stealing content, but they can't do it if the Internet Archive proxies it all for them.

All of the LLMs would be massively less useful if it weren't for scraping the latest news.

Replies

stephen_g • today at 1:47 AM

LLMs have other ways of accessing the content; they don’t need the Web Archive.

Every LLM company can afford to spin up a new subscriber account every day, proxy its traffic so it appears to come from different IPs across all sorts of ASNs, crawl until the account gets banned, and then do it again, and again, and again.

switzer • today at 10:07 AM

LLM companies would then license content from news orgs and other publishers, which is what should happen.

userbinator • today at 3:40 AM

"Stealing" is BS because the original still exists. "Copyright infringement" is more correct.

