Hacker News

SphericalCowww today at 2:15 PM

I mean, since GPT-4, I don't believe more RAM is still producing the miracle where LLM performance scales directly with model size. ChatGPT itself, at least, convinced me that any decent-sized company can build a GPT-4 equivalent in terms of model size; what limits them is the serving side, like memory caching and hallucination handling. Companies buy RAM simply to ride the stock hype.

I am no expert, so this is a shallow take, but I think the global LLM has already reached its limit, and AGI would only be possible if the model lives in the moment, i.e., retrains every minute or so, and is paired with a much smaller device that can observe its surroundings, like a robot.

Instead of a KV cache, I have an idea of using LoRAs: keep a central LLM that is never changed by learning, surrounded by anywhere from a dozen to thousands of LoRAs, made orthogonal to each other, each carrying a weight through which they compete to be trained, say, every minute. The LLM, since it generates autoregressively and so behaves a bit like an RNN anyway, answers a prompt like "summarize what your state and goal are at this moment", and the LoRAs are trained on that summary along with all the observations and, say, inputs from the users. The output of the LoRAs feeds back to the LLM so it can decide the weights for the next round of LoRA training.
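To make the layout above a bit more concrete, here is a minimal pure-Python sketch of one possible reading of it: a frozen base weight matrix, several low-rank adapters whose outputs are mixed by per-adapter weights, and a simple overlap score that is zero when two adapters read from orthogonal input subspaces. The names `LoRA`, `forward`, `overlap`, and `gates` are made up for illustration, the overlap score is just one way to interpret "made orthogonal", and the per-minute training loop and the step where the LLM picks the weights are not shown.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, used for the frozen base and for adapter init."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LoRA:
    """Low-rank update delta_W = B @ A, with A: (rank, d_in) and B: (d_out, rank)."""

    def __init__(self, d_in, d_out, rank, rng):
        self.A = rand_matrix(rank, d_in, rng)
        # B starts at zero, as in the usual LoRA setup, so a fresh adapter is a no-op.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        """Adapter contribution B @ (A @ x)."""
        return matvec(self.B, matvec(self.A, x))


def forward(W, adapters, gates, x):
    """Frozen base output plus a gate-weighted sum of adapter outputs."""
    y = matvec(W, x)
    for g, adapter in zip(gates, adapters):
        d = adapter.delta(x)
        y = [yi + g * di for yi, di in zip(y, d)]
    return y


def overlap(a, b):
    """Squared Frobenius norm of a.A @ b.A^T.

    Zero when every row of a.A is orthogonal to every row of b.A,
    so it could serve as a penalty that pushes adapters apart.
    """
    return sum(
        sum(p * q for p, q in zip(row_a, row_b)) ** 2
        for row_a in a.A
        for row_b in b.A
    )


if __name__ == "__main__":
    rng = random.Random(0)
    W = rand_matrix(3, 4, rng)                      # frozen base weights (never updated)
    adapters = [LoRA(4, 3, 2, rng) for _ in range(3)]
    gates = [0.2, 0.3, 0.5]                         # hypothetical weights chosen by the LLM
    x = [1.0, 2.0, 3.0, 4.0]
    print(forward(W, adapters, gates, x))
    print(overlap(adapters[0], adapters[1]))
```

In this sketch only the `A` and `B` matrices of the adapters would ever be trained; `W` stays fixed, and the `gates` list is where a controller could decide which adapters matter for the current input.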

Anyway, I just think there needs to be a structural change of some kind.


Replies

redanddead today at 4:23 PM

share it on GitHub and make a Show HN post about it, maybe you're right

the models are still very stupid atm, something needs to change