> my infographic would be an original work.
> AI training follows the same principle.
If you really believe that, then we can't have a meaningful conversation about this. That's not even ELI5 territory, that's just disconnected. You should be asking questions, not telling people how it works.
How exactly is it different? The model itself is just a probability distribution over the next token given the input, fitted to a giant corpus, i.e. a description of statistical properties. On its own it doesn't even "do" anything. And even if you wrap it in a text generator and feed it literal gcc source code fragments as input context, the output will quickly diverge, because it's not a copy of gcc and doesn't contain a copy of gcc. It's a description of what language is common in code in general.
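To make "a probability distribution for next token given input" concrete, here's a deliberately tiny stand-in, not a real LLM: a character-level bigram model fitted to a toy corpus. The corpus string and everything else here are made up for illustration. Note that the fitted `counts` table stores statistics of the corpus, not the corpus itself, and generation from a literal seed wanders off according to those statistics:

```python
import random
from collections import Counter, defaultdict

# Toy stand-in for an LLM: a character-level "next token given input"
# distribution, fitted to a small synthetic corpus of C-like code.
corpus = ("int main(void) { return 0; } " * 50
          + "for (int i = 0; i < n; i++) sum += a[i]; " * 50)

# Fit: count which character follows which. This table of statistics
# is the whole "model"; the corpus itself is not stored in it.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_char_distribution(prev):
    # P(next char | previous char), read straight off the counts.
    c = counts[prev]
    total = sum(c.values())
    return {ch: n / total for ch, n in c.items()}

def generate(seed, length=60):
    # Wrap the distribution in a text generator by sampling from it.
    random.seed(0)  # fixed seed so the sketch is reproducible
    out = seed
    for _ in range(length):
        dist = next_char_distribution(out[-1])
        chars, probs = zip(*dist.items())
        out += random.choices(chars, weights=probs)[0]
    return out

# Feed it a literal fragment as context: what comes out is driven by
# the fitted statistics, not by copying any one source text.
print(generate("int main"))
```

A real model conditions on a long context instead of one character, but the object is the same kind of thing: a fitted conditional distribution you sample from.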
In fact, we could make this concrete: use the model as the prediction stage in a compressor and compress gcc with it. The size of the residual is the extent to which the model doesn't contain gcc.
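A sketch of that measurement, under the standard coding-theory fact that an ideal arithmetic coder driven by a model's probabilities spends about -log2 P(text | model) bits. That bit count is the residual. Everything concrete below is a toy stand-in: a character bigram model in place of an LLM, short strings in place of a real training corpus and gcc:

```python
import math
from collections import Counter, defaultdict

# Toy stand-ins: "the model" is a character bigram model fitted to
# `train`; `target` plays the role of gcc.
train = "int add(int a, int b) { return a + b; } " * 100
target = "int mul(int a, int b) { return a * b; } "

counts = defaultdict(Counter)
for prev, nxt in zip(train, train[1:]):
    counts[prev][nxt] += 1

ALPHABET = 256  # byte alphabet, used for Laplace smoothing

def prob(prev, nxt):
    # Smoothed P(next | prev): unseen pairs get small nonzero mass,
    # so every character of the target is codable.
    c = counts[prev]
    return (c[nxt] + 1) / (sum(c.values()) + ALPHABET)

# Ideal code length under the model = sum of -log2 p per transition.
# This is the residual: the bits the model does NOT already account for.
residual_bits = sum(-math.log2(prob(p, n))
                    for p, n in zip(target, target[1:]))
raw_bits = 8 * (len(target) - 1)  # cost with no model at all
print(f"raw: {raw_bits} bits, residual under model: {residual_bits:.0f} bits")
```

The gap between `raw_bits` and `residual_bits` is what the model "knows" about text like the target; the residual is what it doesn't. Running the same measurement with a real LLM and the real gcc sources is the experiment being proposed.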