Hacker News

zozbot234 · today at 1:07 AM

> A Opus 4.7/Gpt5.5 class model is 5 trillion parameters[1].

You could run it on a cluster of nodes that each do some mix of fetching parameters from disk and caching them in RAM. Use pipeline parallelism to minimize network bandwidth requirements given the huge size. Then time to first token may be a bit slow, but sustained inference should achieve enough throughput for a single user. That's a costly setup of course, but it doesn't cost $900k.
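The proposed setup can be sketched in miniature: each pipeline stage owns a contiguous slice of layers, keeps as many of them as fit in an LRU "RAM" cache, and fetches the rest from "disk" on demand; only the small activation crosses stage boundaries, which is what keeps the network bandwidth requirement low. This is a toy simulation under stated assumptions — `LayerCache`, `pipeline_forward`, and the integer stand-ins for weights and layer ops are all hypothetical illustration, not any real inference stack's API:

```python
from collections import OrderedDict

class LayerCache:
    """Toy LRU cache: keeps the most recently used layers in 'RAM',
    fetching the rest from 'disk' (simulated here by a plain dict)."""
    def __init__(self, disk, capacity):
        self.disk = disk            # all weights (in reality: mmap'd files)
        self.capacity = capacity    # how many layers fit in RAM
        self.ram = OrderedDict()
        self.disk_reads = 0         # count slow-path fetches

    def get(self, layer_id):
        if layer_id in self.ram:
            self.ram.move_to_end(layer_id)   # mark as recently used
            return self.ram[layer_id]
        self.disk_reads += 1
        weights = self.disk[layer_id]        # slow path: fetch from disk
        self.ram[layer_id] = weights
        if len(self.ram) > self.capacity:
            self.ram.popitem(last=False)     # evict least recently used
        return weights

def pipeline_forward(token, stages, caches):
    """Each 'node' owns a contiguous slice of layers (one pipeline stage);
    only the activation crosses node boundaries, never the weights."""
    activation = token
    for stage, cache in zip(stages, caches):
        for layer_id in stage:
            w = cache.get(layer_id)
            activation = activation + w      # stand-in for the real layer op
    return activation

# Two 'nodes', four layers each; each node's cache holds all four of its
# layers, so after the first (slow) token every fetch is a RAM hit.
disk = {i: i for i in range(8)}
stages = [[0, 1, 2, 3], [4, 5, 6, 7]]
caches = [LayerCache(disk, 4), LayerCache(disk, 4)]
pipeline_forward(0, stages, caches)          # first token: 8 disk reads
pipeline_forward(0, stages, caches)          # later tokens: 0 new reads
```

The usage at the bottom mirrors the comment's point: time to first token pays the disk-fetch cost, while sustained decoding runs from RAM at full speed.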


Replies

nl · today at 1:18 AM

> You could run it on a cluster of nodes

Not sure this is an MBP either.
