It is lossy, but it is still enough for verbatim recreation. All of Wikipedia is only about 24 GB as losslessly compressed text, and all of JK Rowling's books fit into a few MB, so this material could easily be stored verbatim in trillion-parameter models.

Reasoning about the training cutoff is also something the newest models do pretty well, because they can be taught to do it after pre-training, e.g. via SFT. With tool use, a model can even check actual current sources, which may happen without you noticing in the normal chat apps unless you go through a controlled API call.
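The "24 GB of compressed text" figure rests on how well ordinary prose compresses under a general-purpose lossless compressor. A minimal sketch with Python's built-in `zlib` (the sample text and the repetition count are arbitrary choices for illustration):

```python
import zlib

# Illustrative only: how much natural-language text shrinks under a
# general-purpose lossless compressor (DEFLATE via zlib).
# Note: repeating the sample inflates the ratio well beyond what
# varied prose achieves (roughly 3x for real English text), but it
# shows the mechanism the Wikipedia-size figure relies on.
sample = (
    "Lossless compression removes statistical redundancy from text, "
    "so the original can be reconstructed exactly, byte for byte. "
) * 200

raw = sample.encode("utf-8")
packed = zlib.compress(raw, level=9)

ratio = len(raw) / len(packed)
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, ratio: {ratio:.1f}x")
```

The point is simply that text is highly redundant, so an encyclopedia's worth of prose fits in a space that is tiny relative to a trillion-parameter model's capacity.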