logoalt Hacker News

paxysyesterday at 10:30 PM2 repliesview on HN

Is there a legal distinction between training, post-training, fine tuning and filling up a context window?

In all of these cases an AI model is taking a copyrighted source, reading it, jumbling the bytes and storing it in its memory as vectors.

Later a query reads these vectors and outputs them in a form which may or may not be similar to the original.


Replies

SatvikBeriyesterday at 10:40 PM

Judges have previously ruled that training counts as sufficiently transformative to qualify for fair use: https://www.whitecase.com/insight-alert/two-california-distr...

I don't know of any rulings on the context window, but it's certainly possible judges would rule that would not qualify as transformative.

show 2 replies
derangedHorsetoday at 2:38 AM

The context window is quite literally not a transformation of tokens or a "jumbling of bytes," it's the exact tokens themselves. The context actually needs to get passed in on every request but it's abstracted from most LLM users by the chat interface.