This is a bit of a non sequitur.
The most charitable read I can get is:
> Theft presupposes a legitimate possessory claim by the victim. If A’s possession of X is itself wrongful because A stole X from B, then when C takes X from A, C has not violated A’s rightful ownership of X—because A has none.
I think the whole thing is a bit fraught. In the best case, all frontier model companies would have invested in a giant expansion of Wikipedia, and thus distillation wouldn't be stealing because the base information would already be public and available. Obviously that's not what happened.
However, at this point, I suspect the stolen books (and scraped websites) are largely a footnote in training: something that was essential for creating early models, but relatively minor given the work expended since then to create new content and RL environments.
I think you misunderstood my point. Distillation is not stealing. It's not that:
"Theft presupposes a legitimate possessory claim by the victim. If A’s possession of X is itself wrongful because A stole X from B, then when C takes X from A, C has not violated A’s rightful ownership of X—because A has none."
Rather, Anthropic has no standing to claim that using its model output is 'stealing', because that claim would basically eviscerate its own business model, which depends on telling enterprises that they can use its model output and still claim the generated code as their IP.