That’s 24GB VRAM. Not enough to run a 27B model at a useful quant+context size.
Yeah, it seems to me that the Mac Studios, with their unified memory architecture, are genuinely good bang for the buck at the moment because of this memory-size consideration?
You can run 8-bit 27B models in 24GB; it's definitely enough for the model size.
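For anyone following the sizing argument, here is a rough back-of-envelope sketch of weight memory only. It assumes weights take exactly `bits / 8` bytes per parameter and uses decimal GB; the function names and the `overhead_gb` allowance for KV cache and activations are hypothetical placeholders, not measurements from any runtime.

```python
def weight_gb(params_billion: float, bits: int) -> float:
    """Approximate weight memory in decimal GB (weights only, no KV cache)."""
    bytes_total = params_billion * 1e9 * bits / 8
    return bytes_total / 1e9


def fits(params_billion: float, bits: int, vram_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM.

    overhead_gb is a hypothetical allowance for KV cache and activations;
    the real figure depends on context length and model architecture.
    """
    return weight_gb(params_billion, bits) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for bits in (8, 4):
        print(f"27B at {bits}-bit: ~{weight_gb(27, bits):.1f} GB of weights, "
              f"fits in 24 GB: {fits(27, bits, 24)}")
```

By this estimate, 8-bit weights for a 27B model come to about 27 GB before any context, while 4-bit weights come to about 13.5 GB, so the quant level decides how much of a 24GB card is left for context.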
So buy two.
I beg to differ. Have a look at this repo with configs optimized for single and dual 3090s for Qwen and Gemma models: https://github.com/noonghunna/club-3090