Hacker News

admax88qqq yesterday at 6:53 PM

> One day, maybe not far from now, a breakthrough will allow huge LLMs (say 200B in size) to run well on an old 5 year old Dell desktop.

But if you had such a breakthrough, could you not also apply it to run 200T models in today's datacenters?


Replies

pennomi yesterday at 7:07 PM

That assumes scaling laws still hold up. A bigger model might end up only incrementally more intelligent.

ACCount37 yesterday at 8:15 PM

Not only could you: you would also want to.

The likes of Mythos show that the scaling laws are real, and that you can scale total/active params by 5x/2x and get meaningful gains. If "inference per param" gets cheaper? Up the params and get more intelligence for the same price.

deweywsu yesterday at 7:00 PM

Quite true