That's why he's saying it's not equivalent. For it to be the same, the LLM would have had to train on Minecraft's source code, transforming it into its weights, and then you'd prompt the model to build a game to Minecraft's specifications through prompts alone. Of course it's copyright infringement if you hand a tool Minecraft's source code and tell it to copy it, just as it would be if you ran the source through a photocopier and claimed the output was a recreation of Minecraft.
Is there a legal distinction between training, post-training, fine-tuning, and filling up a context window?
In all of these cases an AI model takes a copyrighted source, reads it, transforms the bytes, and stores them in its memory as vectors.
Later a query reads those vectors back and produces output that may or may not resemble the original.
It's not equivalent, but it's close enough that you can't easily dismiss it.
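The "source → vectors → query retrieves something similar" pipeline described above can be sketched with a toy example. This is not how a transformer actually stores text (real models learn dense embeddings, not word counts); it's just a minimal illustration of text being reduced to vectors and a later query pulling back whatever stored chunk those vectors most resemble. The chunk strings are made up for illustration.

```python
# Toy sketch: copyrighted text is reduced to "vectors", and a later query
# returns the stored chunk its own vector most resembles.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Crude bag-of-words vector; real models learn dense embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = ["creeper explodes near the player",
          "redstone powers the piston"]
store = [embed(c) for c in chunks]  # the "memorized" vectors

query = embed("why does the creeper explode")
best = max(range(len(store)), key=lambda i: cosine(query, store[i]))
print(chunks[best])  # → "creeper explodes near the player"
```

The point of the sketch is that nothing verbatim needs to survive in the vectors for the output to land close to the stored source, which is why the equivalence question is hard to dismiss.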
What if Copilot was already trained with Minecraft code in its dataset? That should be possible to test by asking the model to continue a snippet from the leaked code, the same way a news outlet proved its articles were used for training.
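That test could be sketched roughly as follows: split a known snippet into a prefix and its true continuation, ask the model to complete the prefix, and measure how close the completion is to the ground truth. The `model_complete` callable is a stand-in; in practice you'd call the actual completion endpoint there, and the "leaked" snippet below is invented for illustration.

```python
import difflib

def memorization_score(model_complete, prefix: str, ground_truth: str) -> float:
    """Ask the model to continue `prefix` and compare its output against the
    known continuation. Near-1.0 similarity across many snippets would
    suggest the code was in the training set."""
    continuation = model_complete(prefix)
    return difflib.SequenceMatcher(None, continuation, ground_truth).ratio()

# Stand-in "model" that has memorized the snippet verbatim; a real test
# would call the Copilot/LLM API here instead.
leaked = "public void tick() { this.age++; if (this.age > MAX_AGE) this.discard(); }"
prefix, truth = leaked[:30], leaked[30:]
memorized = lambda p: leaked[len(p):]

print(memorization_score(memorized, prefix, truth))  # verbatim recall → 1.0
```

A model that never saw the code would still produce plausible completions, so you'd want many snippets and a baseline for comparison, but high verbatim recall is hard to explain any other way.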