admax88qqq • yesterday at 6:53 PM • 3 replies

> One day, maybe not far from now, a breakthrough will allow huge LLMs (say 200B in size) to run well on an old 5 year old Dell desktop.

But if you had such a breakthrough, could you not also apply it and run 200T models in today's datacenters?

Replies

pennomi • yesterday at 7:07 PM

That assumes scaling laws still hold up. A bigger model might end up only incrementally more intelligent.

ACCount37 • yesterday at 8:15 PM

Not only could you: you would also want to.

The likes of Mythos show that the scaling laws are real: you can 5x the total params and 2x the active params and still get meaningful gains. If "inference per param" gets cheaper? Up the params and get more intelligence for the same price.
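
As a rough back-of-the-envelope sketch of that trade (not a cost model): assume inference price scales linearly with parameter count, which is a simplification, so a breakthrough that makes each parameter `k` times cheaper lets the same budget serve `k` times the parameters. The function names here are hypothetical, made up for illustration.

```python
# Assumption: inference price is proportional to parameter count.
# Real serving cost also depends on memory bandwidth, batching, and so on.

B = 1e9   # one billion parameters
T = 1e12  # one trillion parameters


def affordable_params(current_params: float, cost_reduction: float) -> float:
    """Parameters servable at the same price after per-param cost drops by `cost_reduction`x."""
    return current_params * cost_reduction


def required_cost_reduction(current_params: float, target_params: float) -> float:
    """How many times cheaper per-param inference must get to serve `target_params` at the same price."""
    return target_params / current_params


# Going from 200B to 200T at the same price needs a 1000x per-param saving.
factor_200b_to_200t = required_cost_reduction(200 * B, 200 * T)

# A 5x saving applied to a 200B model buys a 1T model for the same price.
params_after_5x = affordable_params(200 * B, 5)
```

Under this linear assumption the same multiplier applies at either end: whatever factor squeezes a 200B model onto a desktop would also stretch a datacenter budget by that factor.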

deweywsu • yesterday at 7:00 PM

Quite true
