Hacker News

joefourier, yesterday at 7:04 PM (2 replies)

Who is going to buy a $4299 M5 Max MBP with 64GB of RAM just to run Gemma 4 31b? Firstly you don't need 64GB for that model. Secondly, if you want a machine that sits in the corner and does nothing but LLM inference, you don't buy a MacBook Pro, you buy some GPUs, which are going to cost you a fraction of that (~$1k for ~64GB of VRAM is possible). The people buying Apple Silicon for inference generally aim for the Mac Studios with enormous amounts of RAM (128-512GB) to run very large models.

The idea is obviously to be running the LLM on your work laptop. As a developer I'd need a laptop with 24GB of RAM for work anyway, and 48GB, which is enough for a very good quant of Gemma, is just $400 extra.
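For a rough sense of why the quant level matters for the RAM tiers mentioned here, a back-of-the-envelope sketch of the weights-only footprint follows. The 31e9 parameter count is read off the "31b" in the model name, and the bits-per-weight values are illustrative assumptions; real quant formats carry extra scale metadata, and the runtime needs memory beyond the weights.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: "31b" in the model name means roughly 31e9 parameters.
N_PARAMS = 31e9

# Illustrative bit widths, not a list of actual published quants.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_gib(N_PARAMS, bits):.1f} GiB")
```

On a unified-memory laptop the OS and other applications share the same pool, so the usable budget is smaller than the headline RAM figure.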


Replies

zozbot234, yesterday at 10:55 PM

> Firstly you don't need 64GB for that model.

You might need that to run it with a longer context; KV cache size is a known issue with that model series.
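To make the context-length point concrete, here is a minimal sketch of the usual KV cache estimate for a transformer that keeps full attention in every layer. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the real configuration of the model under discussion; models that use sliding-window or otherwise reduced attention in some layers would need less.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in GiB for one sequence.

    The factor of 2 accounts for one key and one value tensor per layer.
    bytes_per_elem=2 assumes a 16-bit cache.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * bytes_per_elem)
    return total_bytes / 2**30


# Hypothetical architecture numbers, for illustration only.
cfg = dict(n_layers=60, n_kv_heads=16, head_dim=128)

for ctx in (8_192, 32_768, 131_072):
    print(f"context {ctx}: {kv_cache_gib(context_len=ctx, **cfg):.2f} GiB")
```

The cache grows linearly with context length, so it can end up comparable to or larger than the quantized weights at long contexts.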

vardump, yesterday at 9:14 PM

> Gemma 4 31b? Firstly you don't need 64GB for that model.

You don't? It for sure doesn't run on my 32 GB M2 Max.
