
avazhi | today at 5:17 AM

I think the main issue is, as the other commenter alluded to, the parameter discrepancy. I know Mixture of Experts models are popular specifically because they save a lot of compute for a given parameter count, but if your initial answer space is two orders of magnitude smaller on a local machine than on the frontier cloud models, the first answer isn't going to be as good to begin with, and that knowledge gap only widens as the conversation continues. I don't see how to close that parameter gap without better hardware - there's only so much optimisation you can do. At the end of the day, parameterised knowledge takes up some minimum number of bits, and you can't excise those bits without the actual knowledge and intelligence suffering.
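Just to make the hardware constraint concrete, here's a rough back-of-envelope sketch of the weight-memory arithmetic. The parameter counts and bit widths are illustrative assumptions (frontier models don't publish theirs), not real figures for any specific model:

    # Illustrative only: approximate memory to hold model weights.
    def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
        """Weights-only footprint in GB (ignores KV cache and activations)."""
        return params_billion * 1e9 * bits_per_param / 8 / 1e9

    # Hypothetical numbers: a 20B local model at 4-bit quantisation
    # versus an assumed ~2T-parameter cloud model at 8-bit.
    print(f"local:    ~{weight_memory_gb(20, 4):.0f} GB")    # ~10 GB, fits a consumer GPU
    print(f"frontier: ~{weight_memory_gb(2000, 8):.0f} GB")  # ~2000 GB, needs a datacenter rack

Quantisation and MoE routing shave the constant factors, but they don't change the basic scaling: two orders of magnitude more parameters means roughly two orders of magnitude more memory somewhere.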