alpaca128 • yesterday at 11:00 PM • 1 reply

What if Copilot was already trained with Minecraft code in its dataset? It should be possible to test this by prompting the model to continue a snippet from the leaked code, the same way a news website proved its articles were used for training.
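A rough sketch of that continuation test, in Python. Everything here is illustrative: `complete` is a hypothetical stand-in for whatever call sends a prompt to the model and returns its text, and `memorization_score` and `prefix_fraction` are names invented for this example. The idea is to feed the model the first part of a known snippet and measure how closely its output matches the real remainder.

```python
from difflib import SequenceMatcher
from typing import Callable


def memorization_score(
    snippet: str,
    complete: Callable[[str], str],  # hypothetical: prompt in, model continuation out
    prefix_fraction: float = 0.5,    # assumed split point; tune per snippet
) -> float:
    """Return a similarity in [0, 1] between the model's continuation
    and the true remainder of the snippet."""
    split = int(len(snippet) * prefix_fraction)
    prefix, expected = snippet[:split], snippet[split:]

    # Only compare as many characters as the true remainder contains.
    continuation = complete(prefix)[: len(expected)]

    # autojunk=False keeps the comparison exact for longer inputs.
    return SequenceMatcher(None, continuation, expected, autojunk=False).ratio()
```

A score near 1.0 on long, distinctive snippets would be suggestive of memorization, but it is not proof on its own: short or boilerplate-heavy code can be reproduced by a model that never saw that particular source, so it would make sense to compare against snippets the model could not have been trained on.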

Replies

NewsaHackO • yesterday at 11:49 PM

I feel as though the fact that you are asking a valid question shows how transformative it is; clearly, while the LLM gets a general ability to code from its training corpus, the data gets so transformed that it's difficult to tell what exactly it was trained on except a large body of code.
