I'm not sure how LLMs count as fair use. Is it just that we can't show HOW the works have been encoded in the model, and that means it's fair use? Or that statistical representations are fair use? Or is it the generation aspect? I can't sell you a Harry Potter book, but I can sell you some service that lets you generate it yourself?
I feel like this has really blown a hole in copyright.
That’s one hell of a headline for a story about Meta winning summary judgement for most of the claims against them. You’d be forgiven for thinking Meta lost this case, going by the headline.
Surely it should be a whole separate copyright case with fines of up to $150,000 per work infringed.
The system wants to destroy creativity and humans. The system wants artists and writers to depend on the patronage of the ultra rich who steal their output.
We are back to a feudal society, except that monarchs or their advisers at least had taste, as opposed to the nouveau riche.
Feels more like it serves a political purpose, since AI is now a "national strategy".
Wait. So I can pirate anything as long as I intend to train an LLM with it?
Sorta, not really. They said the plaintiffs had an irrelevant argument or something.
So the argument is that by torrenting ebooks, Meta provided bandwidth to the torrent network, and thus provided (financial??!) benefit to pirate sites?
I've got to be honest, that sounds extremely weak to me. The benefit to the pirate site of joining the torrent swarm seems like it would be extremely slight.
Just read (most of) the ruling.
The ruling is fine. The judge is not Alsup, but he’s not technically incompetent either, which is good.
The torrent comments in general are nothing to get het up about; in summary:
1) Meta wanted to download, but not upload, LibGen and Anna’s Archive after it couldn’t find anyone with licensing rights who would talk to them.
2) They didn't want to distribute; just download. An engineer put in evidence that they restricted seeding successfully.
3) Late in the case, Silverman et al. claimed that while Meta hadn't been seeding, it had been leeching, and that this counts as distribution (?!)
The judge commented as follows:
1. Just downloading is probably fine because it could be for purposes of fair use, and fair use concerns generally trump even good faith and fair dealing.
2. Nobody could get Llama to spit out more than a 60-token quote from a plaintiff's book; thus Llama is not made for infringement.
3. We will need more briefing on this leeching thing, which is alleged to be a form of distribution.
The judge lays out what he thinks a workable claim to get to the Supreme Court would be, which is that these LLMs defeat the purpose of our copyright laws by reducing the amount of human creativity and expression available to those who want to create economic value through creativity. E.g., where will the jobs for biographers go?
I will say that debate is an active topic worldwide right now and a good question, with answers ranging from “this maximizes human creativity bro” to “laser printers disrupted lead type foundries, that was great” to “nobody will ever write again and we are murdering our creative class and burning down their craftsman mid-century modern homes.”
It seems to me this will get taken up next session with SCOTUS but also that it’s a little early; we just don’t know where this is going exactly. Either way, I expect our current judge will learn that leeching is precisely NOT seeding once the defense legal team has time to brief him.