2001zhaozhao • today at 6:18 AM

At some point we have to be running into some inherent mathematical limits of knowledge compression, right? There's no way the knowledge-benchmark scores of these 8B models will keep getting better without overfitting to those benchmarks.

Replies

yorwba • today at 6:51 AM

If you give the model access to specialized tools (e.g. web search for question answering) the knowledge doesn't have to be stored in the model weights, which leaves some room for improvement. You'd still be overfitting to benchmarks (since different tasks might require different tools) but not necessarily to specific benchmark questions, so within-domain generalization could be quite good.

As an example of a similar approach, Teapot AI has trained very small models (https://teapotai.com/models) to only answer questions where the answer can be found within the context window, and although not perfect, they do quite well at this compared to larger, more general models.
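
To make the pattern concrete, here is a minimal sketch of "look it up, then answer only from the context", assuming a toy in-memory corpus in place of a real search tool. Everything in it (`DOCS`, `search`, `answer`, the stopword list, the overlap threshold) is hypothetical and for illustration only; it is not Teapot AI's method, and a real system would pass the retrieved context to a small model instead of returning it verbatim.

```python
import re

# Hypothetical stand-in for an external tool such as web search.
DOCS = [
    "The Eiffel Tower is 330 metres tall.",
    "Mount Everest is 8849 metres high.",
]

# Tiny illustrative stopword list; a real retriever would do better.
STOPWORDS = {"how", "what", "who", "is", "the", "a", "of"}


def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(question, docs=DOCS, min_overlap=2):
    """Return the document sharing the most content words with the question,
    or None if no document shares at least `min_overlap` words."""
    query = tokens(question) - STOPWORDS
    best, best_score = None, 0
    for doc in docs:
        score = len(query & tokens(doc))
        if score > best_score:
            best, best_score = doc, score
    return best if best_score >= min_overlap else None


def answer(question):
    """Answer only when supporting context was found; otherwise refuse.
    Returning the context itself is a placeholder for a small model
    that would read the context and phrase the answer."""
    context = search(question)
    if context is None:
        return "I don't know."
    return context


print(answer("How tall is the Eiffel Tower?"))
print(answer("Who wrote Hamlet?"))
```

The facts live in `DOCS` rather than in any model weights, so the only thing the "model" has to get right is using the context and refusing when there isn't any.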
