Claiming LLMs are fair use is ridiculous, bordering on ignorant or disingenuous.
Here’s the four-factor test from 17 U.S.C. § 107:
1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
Fail. The use is to make trillions of dollars and be maximally disruptive.
2. the nature of the copyrighted work;
Fail. In many cases at least, the copy written code is commercial or otherwise supports livelihoods; and is the result much high skill labor with the express stipulation for reciprocity.
3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
Fail. They use all of it.
4. the effect of the use upon the potential market for or value of the copyrighted work.
Fail to the extreme. There is already measurable decline in these markets. The leaders explicitly state that they want to put knowledge workers out of business.
- - -
Hell, LLMs don’t even pass the sniff test.
The only reason this stuff is being entertained is some combination of the prisoner’s dilemma and more classic greed.
You’re mixing up “using” with “copying”. You are allowed to “use” all of a book or movie or code by listening to or watching or reviewing the whole thing. Copyright protects copies. The legal claim here is that training an LLM is sufficiently transformative that the result cannot be construed as a copy.
> Fail. The use is to make trillions of dollars and be maximally disruptive.
Fair use has repeatedly been found even in cases where the copies were used for commercial purposes. See Sony v. Connectix, for example, where cloning and disassembling the PlayStation BIOS in order to make a commercially sold (at retail, in a box) emulator of a then-current game console was determined to be fair use.
> Fail. In many cases at least, the copy written code is commercial or otherwise supports livelihoods; and is the result much high skill labor with the express stipulation for reciprocity.
Again, see Sony v. Connectix, where the sales of PlayStation consoles supported the livelihoods and skilled labor of Sony engineers.
> Fail. They use all of it.
And again, see Sony v. Connectix, where the entire BIOS was copied again and again until a clone could be written that sought to reproduce all the functionality of the real BIOS. Or see Google v. Oracle, where cloning the entire Java API for a competing commercial product was also deemed fair use. Or the Google Books lawsuits, where cloning entire books for the purposes of making them searchable online was deemed fair use. Or see any of the various time/format shifting cases over the years (cassette tapes, VCRs, DVRs, MP3 encoders, DVD ripping, etc.) where making whole and complete copies of works is deemed fair use.
> Fail to the extreme. There is already measurable decline in these markets. The leaders explicitly state that they want to put knowledge workers out of business.
Again, see Sony v. Connectix, where the commercial product deemed to be fair use was directly competing with an actively sold video game console. Copyright protects the rights of creators to exploit their own works; it does not protect them against any and all forms of competition.
Or perhaps instead of referring you to the history of copyright case law in the digital age, I should simply point you at Judge Alsup's ruling in the Bartz case, where he details exactly why the facts of the case and prior case law support finding that training an AI on copyrighted material is fair use [1]. Of particular interest to you might be the fact that each of the 4 factors is not a simple "pass/fail" metric, but a weighing of relative merits. For example, when examining factor 1, Judge Alsup writes:
> That the accused is a commercial entity is indicative, not dispositive. That
> the accused stands to benefit is likewise indicative. But what matters most
> is whether the format change exploits anything the Copyright Act reserves to
> the copyright owner.
[1]: https://admin.bakerlaw.com/wp-content/uploads/2025/07/ECF-23...
These are factors to be considered, not pass/fail questions.
This comment highlights a basic dilemma about how and where to spend your time.
Here's a basic rule of thumb I recommend for long, contentious threads like this one, where not every commenter limits themselves to things they understand and where legal topics attract some of the most tortured motivated reasoning:
If the topic is copyright and someone who is speaking authoritatively has just used the words "copy written", then ignore them. Consider whether you need to be anywhere in the conversation at all, even as a purely passive observer. Think about all the things you can do instead of wasting your time here, where the stakes for participation are so low because nothing that is said here really matters. Go do something productive.