Hacker News

aetherspawn, today at 2:31 AM

I wish they would release the requirements to run on llama.cpp with any announcements of open models.

A bonus would be tok/s on common hardware.


Replies

lcampbell, today at 2:59 AM

I don't think llama.cpp supports any of the LongCat models, actually.

They haven't posted weights/inference solutions for LongCat-2.0 [1], but LongCat-Next had transformers support, which I assume means it works with vLLM/SGLang.

Given it's 1.6T parameters, "common hardware" is probably out of the question; 2bpw still comes out to about 400GB for the weights alone, and that's before considering the bandwidth requirements for 48B active. I haven't read the LongCat-2.0 architecture docs, but if you're not running GLM-5.2, you're probably not running this either.
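For anyone who wants to redo that arithmetic for other quant levels, here is a rough back-of-envelope sketch. It assumes weight size is simply parameters × bits per weight / 8, and that decode speed is bounded by reading every active weight once per token. It ignores KV cache, activations, and quantization metadata, so real numbers will be worse. The 100 GB/s bandwidth figure and the function names are hypothetical placeholders, not measurements or anything from the LongCat docs.

```python
# Back-of-envelope sizing for a large MoE model.
# Assumptions (not from any LongCat documentation):
#   - weight footprint = params * bits_per_weight / 8 bytes
#   - decode is memory-bandwidth bound: each token reads all active weights once
#   - KV cache, activations, and quantization scales/zero-points are ignored

TOTAL_PARAMS = 1.6e12   # 1.6T total parameters (figure quoted in the thread)
ACTIVE_PARAMS = 48e9    # 48B active parameters per token (figure quoted in the thread)


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Bytes needed to store `params` weights at the given bits per weight."""
    return params * bits_per_weight / 8


def decode_tok_per_s_upper_bound(
    active_params: float, bits_per_weight: float, mem_bandwidth_bytes_per_s: float
) -> float:
    """Optimistic tokens/s ceiling when decode is limited by memory bandwidth."""
    return mem_bandwidth_bytes_per_s / weight_bytes(active_params, bits_per_weight)


if __name__ == "__main__":
    hypothetical_bandwidth = 100e9  # 100 GB/s, a placeholder; substitute your own hardware
    for bpw in (2, 4, 8):
        total_gb = weight_bytes(TOTAL_PARAMS, bpw) / 1e9
        ceiling = decode_tok_per_s_upper_bound(ACTIVE_PARAMS, bpw, hypothetical_bandwidth)
        print(f"{bpw} bpw: {total_gb:.0f} GB of weights, <= {ceiling:.1f} tok/s at 100 GB/s")
```

At 2bpw this gives the 400GB figure above, plus 12GB of active weights read per token, which is why memory bandwidth matters about as much as capacity here.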

[1] https://huggingface.co/meituan-longcat/LongCat-2.0: "Model weights coming soon — stay tuned!"
