Hacker News

Tadpole9181 yesterday at 1:27 PM (2 replies)

So GitHub and Windows and IDEs need to be open source because they can output FOSS code? That's obviously ridiculous.

If an AI outputs copyrighted code, that is a copyright violation. And if it does and a human uses it, then you are welcome to sue the human or LLM provider for that. But you don't get to sue people for perceived "latent" thought crimes.


Replies

vova_hn2 yesterday at 1:47 PM

First of all, I'm not advocating for this claim; I'm merely trying to clarify what other people say.

That being said, I don't think that your analogy is valid in this case.

> GitHub and Windows and IDEs need to be open source because they can output FOSS code

They can output FOSS code, but they themselves are not derived from FOSS code.

It can be argued that the weights of a model are derived from training data, because they contain something from the training data (hard to say what exactly: knowledge, ideas, patterns?)

It can also be argued that output is derived from weights.

If we accept both of those claims, then GPL training data -> GPL weights -> every output is GPL

> If an AI outputs copyrighted code

Again, the issue is not what exactly the AI outputs, but where it comes from.

eru yesterday at 2:23 PM

It would be relatively easy to scan the output of the LLM for copyrighted material before handing it to the user.

(I say 'relatively easy'. Not that it would be trivial.)
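
To make the idea concrete, here is a minimal sketch of one way such a scan could work: compare token n-grams ("shingles") of the model output against an index built from known licensed code. This is only an illustration, not how any LLM provider actually does it. The function names, the `n` and `threshold` values, and the tiny corpus are all hypothetical. It only catches near-verbatim copies, and it assumes you already have the reference corpus, which is arguably the hard part.

```python
import re

def tokens(text):
    # Lowercased word and punctuation tokens; whitespace differences are ignored.
    return re.findall(r"\w+|[^\w\s]", text.lower())

def shingles(text, n=8):
    # Set of all runs of n consecutive tokens in the text.
    toks = tokens(text)
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def build_index(corpus, n=8):
    # Map each shingle to the names of the reference documents containing it.
    # `corpus` is a hypothetical dict of {document name: source text}.
    index = {}
    for name, text in corpus.items():
        for s in shingles(text, n):
            index.setdefault(s, set()).add(name)
    return index

def flag_overlap(output, index, n=8, threshold=0.5):
    # Return {document name: fraction of the output's shingles found in it}
    # for every reference document at or above the (arbitrary) threshold.
    out = shingles(output, n)
    if not out:
        return {}
    hits = {}
    for s in out:
        for name in index.get(s, ()):
            hits[name] = hits.get(name, 0) + 1
    return {name: c / len(out) for name, c in hits.items()
            if c / len(out) >= threshold}
```

Usage would look something like `flag_overlap(llm_output, build_index(licensed_corpus))`, with anything flagged being withheld or reviewed before it reaches the user. Renamed variables or reordered statements would slip past this kind of check, which is part of why it is 'relatively easy' rather than trivial.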