> Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.
Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.
I think (if this ever comes to that) Chinese lab should use same arguments against Anthropic.
UPDATE: this is slight hyperbole of course, not worth arguing what they actually said. The point is intent and the facts - "The Big LLMs" "distilled" collective knowledge including copyrighted works at unimaginable scale, but it's all kosher and totally not piracy/copyright infringement. Though if you're teenager torrenting an mp3 - you'll get screwed.
Isn’t the output of LLMs completely copyright-free in the US?
> Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.
That surely can't be what they argue, because I'm sure I can't translate a copyrighted book into a different language and say "that's fine, it's not word-for-word".
> LLMs don’t output full copyrighted works word for word
Apparently they do, as per the evidence in the NYT vs OpenAI suit.