I’m keen to understand speed here etc etc. if I bought a Mac studio with 96GB - what can I realistically run, how’s it compare to fable/opus etc and how fast is it?
Currently maxing out two Claude code accounts every x hours when working on large code migrations or setting up new iOS apps etc - most of time it’s fine but occasionally it’s mega frustrating!
I think currently you can only get the M3 Ultra Studio with 96gb, and for coding tasks, say you rub Qwen Coder on it (which doesn't need that much ram), it's not the fastest, something like 30-40 tok/sec. Probably better with a MacBook Pro with the M5 chip. There is a website for comparing different configurations and models: https://llmcheck.net/benchmarks
[dead]
I strongly recommend trying LM Studio - it's the lowest friction way to try out models, you can browse https://lmstudio.ai/models and click "Get" and then "Run in LM Studio" to download and run a model.
With 96GB I'd start with the Gemma 4 and Qwen 3.6 models. Any of those should work fine.