Hacker News

Matl yesterday at 7:04 PM

> I'm not really sure why you say it isn't an attack or adversarial. One side is doing something the other side doesn't want. Seems to be pretty clearly adversarial.

This is Anthropic we're talking about here. A company that's infamous for adversarial scraping of copyrighted content. I generally don't accept their framing, especially when it's pretty clear what the end goal of that is.


Replies

wrsh07 today at 12:31 AM

This is a bit of a non sequitur.

The most charitable read I can get is:

> Theft presupposes a legitimate possessory claim by the victim. If A’s possession of X is itself wrongful because A stole X from B, then when C takes X from A, C has not violated A’s rightful ownership of X—because A has none.

I think the whole thing is a bit fraught. In the best case, all frontier model companies would have invested in a giant expansion of Wikipedia and thus distillation would be stealing because the base information is already public and available. Obviously that's not what happened.

However, at this point, I suspect the stolen books (and scraped websites) are largely a footnote of training: something that was essential to create early models, but relatively minor given the work expended since to create new content and RL environments.
