Is there any indication of what compute resources this will actually require (in its various incarna...

mgc8 • yesterday at 8:53 PM • 2 replies

Is there any indication of what compute resources this will actually require (in its various incarnations)? Does it incorporate any of the optimisations pioneered by Google (such as TurboQuant, MTP) or some other original innovations to make the frontier quality realistically available to local users?

Replies

wgd • yesterday at 10:39 PM

The GLM-5 series is 744B-A40B (744B total parameters, 40B active per token). This is not a local model by any reasonable definition of local, but it's an open model, which means (once they upload the weights in a week or so) there will be a dozen third-party inference providers competing on price per token.
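For a rough sense of why that size rules out local use, here is a back-of-envelope sketch of the memory needed just to hold the weights. It assumes "744B-A40B" means 744B total and 40B active parameters, and the precisions listed are illustrative choices, not anything announced for this model. KV cache, activations, and runtime overhead are ignored, so real requirements would be higher.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 744e9   # assumed reading of "744B"
ACTIVE_PARAMS = 40e9   # assumed reading of "A40B" (active per token)

# Illustrative precisions only; not a statement about released formats.
for bits in (16, 8, 4):
    total_gb = weight_memory_gb(TOTAL_PARAMS, bits)
    active_gb = weight_memory_gb(ACTIVE_PARAMS, bits)
    print(f"{bits:>2}-bit: ~{total_gb:.0f} GB for all weights, "
          f"~{active_gb:.0f} GB touched per token")
```

All of the weights have to be resident somewhere even though only the active subset is used per token, so the total figure is the one that decides what hardware is needed.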


dakolli • today at 1:59 AM

If you have 80k in hardware, you can run it. There is no such thing as an effective local model that runs on consumer hardware; anybody telling you otherwise is lying or delusional. JuSt a FeW MoRe ReLeAsEs

