Hacker News

CuriouslyC | yesterday at 2:03 AM | 1 reply

Parameter size gets you world knowledge and better persistence of behavior as context grows. Both of those things can be engineered around to a large degree, and the latest Qwen models show that small models can be quite smart in narrow domains and short time windows.


Replies

alfiedotwtf | yesterday at 6:48 AM

… maybe we should just teach models how to get their world knowledge from a local Postgres connection! Then the model can be tiny, and it can query to its little heart's desire AND run on commodity hardware TODAY!
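
For what it's worth, here is a rough sketch of what that lookup could look like as a tool a small model calls. Everything in it is an assumption for illustration: the `facts` table, the `lookup` and `build_prompt` helpers, and the prompt layout are all hypothetical, and `sqlite3` stands in for Postgres so it runs without a server.

```python
import sqlite3
from typing import Optional


def build_store() -> sqlite3.Connection:
    """Create a tiny in-memory fact store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts (topic TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO facts (topic, body) VALUES (?, ?)",
        [
            ("postgres", "PostgreSQL is an open-source relational database."),
            ("hacker news", "Hacker News is a link aggregator run by Y Combinator."),
        ],
    )
    conn.commit()
    return conn


def lookup(conn: sqlite3.Connection, topic: str) -> Optional[str]:
    """Tool the model would call: return the stored fact, or None if absent."""
    row = conn.execute(
        "SELECT body FROM facts WHERE topic = ?",
        (topic.strip().lower(),),  # parameterized, never string-formatted
    ).fetchone()
    return row[0] if row else None


def build_prompt(conn: sqlite3.Connection, question: str, topic: str) -> str:
    """Put the retrieved fact in front of the question for the small model."""
    fact = lookup(conn, topic)
    context = fact if fact is not None else "(no stored fact)"
    return f"Context: {context}\nQuestion: {question}"


if __name__ == "__main__":
    store = build_store()
    print(build_prompt(store, "What is Postgres?", "postgres"))
```

Against a real Postgres you would swap the connection for a driver such as psycopg and use `%s` placeholders instead of `?`; the shape of the tool stays the same. The hard part this skips is getting the model to pick the right topic to query in the first place.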