> ... on my Macbook Max M5 128 GB
Local development for who? How many of y'all are rocking 128GB of memory? Am I reading Apple's site correctly that it's a $10,000 laptop?
A 27B model can fit easily on a 32GB VRAM card (e.g. 5090) or a 32GB computer in RAM at FP8/Q8 (unsloth have 28.6GB Q8 files).
For 24GB VRAM cards (e.g. 4090) you can use Q6_K (22.5GB) or Q5_K_M (19.5GB) quants, possibly offloading some of the weights to RAM.
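For anyone who wants to sanity-check those file sizes, here is a back-of-the-envelope sketch. It assumes the weights dominate the file and uses the nominal bits per weight of llama.cpp's base block formats. The `_M` mixes keep some tensors at higher precision, so real files come out somewhat larger than this estimate. The function name and the flat 27e9 parameter count are assumptions for illustration, not anything taken from a model card.

```python
# Nominal bits per weight for llama.cpp base block formats.
# The *_M mixes store some tensors at higher precision, so real files run larger.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale per block (34 bytes)
    "Q6_K": 6.5625,  # 210 bytes per 256-weight super-block
    "Q5_K": 5.5,     # 176 bytes per 256-weight super-block
    "Q4_K": 4.5,     # 144 bytes per 256-weight super-block
}


def estimate_weights_gb(n_params: float, quant: str) -> float:
    """Approximate weight size in GB (10^9 bytes), ignoring file metadata."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    # Assumption: treat "27B" as exactly 27e9 parameters.
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_weights_gb(27e9, quant):.1f} GB")
```

Under these assumptions Q8_0 lands at roughly 28.7 GB, in line with the 28.6GB files mentioned above. The lower quants estimate a bit under the quoted sizes, which is expected given the mixed-precision tensors.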
I'm on a 128GB RAM Strix Halo. I bought a Framework Desktop for a few thousand CAD back when everyone was calling it overpriced.
It wasn't $10k a month ago
I work with a lot of 3D graphics and geo stuff, so I can hit the ceiling with my 48 GB Mac. It's not all LLM work. I prioritized storage over RAM with my budget. Being able to run local LLMs has greatly helped me understand how they work. For day-to-day dev I pay for Gemini or Claude.
Think commercial. My company invested in a local rig since privacy is important to our customers and sometimes I want to use these models on private data.
Qwen3.6 runs great on a GPU with 24GB VRAM. You could get a used 3090 for it.
You don't need nearly that much RAM to run Qwen 3.6 27B, though. qwen3.6:27b-q4_K_M is only 17GB, for example.
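To put the sizes quoted in this thread side by side, here is a rough sketch that picks the largest quant fitting a given memory budget. The 2 GB headroom for KV cache and runtime buffers is an assumed placeholder (the real figure depends on context length and backend), and the function name is hypothetical.

```python
from typing import Optional

# File sizes in GB as quoted in this thread for a 27B model.
QUANT_SIZES_GB = {"Q8": 28.6, "Q6_K": 22.5, "Q5_K_M": 19.5, "Q4_K_M": 17.0}


def largest_fitting_quant(memory_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quant whose file fits in memory_gb minus headroom.

    headroom_gb is an assumed allowance for KV cache and runtime buffers.
    Returns None when nothing fits without offloading.
    """
    budget = memory_gb - headroom_gb
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= budget}
    return max(fitting, key=fitting.get) if fitting else None


if __name__ == "__main__":
    for mem in (16, 24, 32):
        print(f"{mem} GB -> {largest_fitting_quant(mem)}")
```

With that headroom, a 24GB card ends up at Q5_K_M rather than Q6_K, which matches the earlier point that Q6_K may need some weights offloaded to RAM.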