logoalt Hacker News

littlestymaarlast Thursday at 7:00 PM5 repliesview on HN

It's not even clear you can license language model weight though.

I'm not a lawyer but the analysis I've read had a pretty strong argument that there's no human creativity involved in the training, which is an entirely automatic process, and as such it cannot be copyrighted in any way (the same way you cannot put a license on a software artifact just because you compiled it yourself, you must have copyright ownership on the source code you're compiling).


Replies

skissanelast Thursday at 7:18 PM

IANAL either but the answer likely depends on the jurisdiction

US standards for copyrightability require human creativity and model weights likely don’t have the right kind of human creativity in them to be copyrightable in the US. No court to my knowledge has ruled on the question as yet, but that’s the US Copyright Office’s official stance.

By contrast, standards for copyrightability in the UK are a lot weaker than-and so no court has ruled on the issue in the UK yet either, it seems likely a UK court would hold model weights to be copyrightable

So from Google/Meta/etc’s viewpoint, asserting copyright makes sense, since even if the assertion isn’t legally valid in the US, it likely is in the UK - and not just the UK, many other major economies too. Australia, Canada, Ireland, New Zealand tend to follow UK courts on copyright law not US courts. And many EU countries are closer to the UK than the US on this as well, not necessarily because they follow the UK, often because they’ve reached a similar position based on their own legal traditions

Finally: don’t be surprised if Congress steps in and tries to legislate model weights as copyrightable in the US too, or grants them some sui generis form of legal protection which is legally distinct from copyright but similar to it-I can already hear the lobbyist argument, “US AI industry risks falling behind Europe because copyrightability of AI models in the US is legally uncertain and that legal uncertainty is discouraging investment”-I’m sceptical that is actually true, but something doesn’t have to be true for lobbyists to convince Congress that it is

show 3 replies
dragonwriterlast Friday at 1:32 AM

> It's not even clear you can license language model weight though.

It is clear you can license (give people permissions to) model weights, it is less clear that there is any law protecting them such that they need a license, but since there is always a risk of suit and subsequent loss in the absence of clarity, licenses are at least beneficial in reducing that risk.

AlanYxlast Thursday at 7:19 PM

That's one of the reasons why they gate Gemini Nano with the "Gemini Nano Program Additional Terms of Service". Even if copyright doesn't subsist in the weights or if using them would be fair use, they still have recourse in breach of contract.

show 2 replies
km3rlast Thursday at 7:24 PM

Why not? Training isn't just "data in/data out". The process for training is continuously tweaked and adjusted. With many of those adjustments being specific to the type of model you are trying to output.

show 1 reply
IncreasePostslast Thursday at 7:31 PM

Doesn't that imply just the training process isn't copyrightable? But weights aren't just training, they're also your source data. And if the training set shows originality in selection, coordination, or arrangement, isn't that copyrightable? So why wouldn't the weights also be copyrightable?

show 3 replies