If you have a 256 GB or 512 GB Mac Studio, the real game is running multiple sessions in parallel to make the best use of your limited memory bandwidth: each decode step streams the weights from memory once, and every session in the batch shares that read. You'd have plenty of spare RAM for that, given how small the KV cache is even at max context.
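
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes decoding is purely memory-bandwidth-bound, that a dense model's weights are read once per step no matter how many sessions are batched, and that each session also reads its own KV cache. The function names and every figure you plug in (bandwidth, model size, KV size) are placeholders for illustration, not measurements of any particular Mac or model, and compute limits are ignored.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float,
                             kv_gb_per_session: float, sessions: int) -> float:
    """Aggregate tokens/s over all sessions under a bandwidth-only model."""
    # One decode step reads the weights once (shared by every session)
    # plus each session's own KV cache.
    gb_read_per_step = weights_gb + kv_gb_per_session * sessions
    steps_per_second = bandwidth_gb_s / gb_read_per_step
    # Each step produces one token per session.
    return steps_per_second * sessions


def max_sessions_in_ram(ram_gb: float, weights_gb: float,
                        kv_gb_per_session: float) -> int:
    """How many full-context KV caches fit next to the weights."""
    return int((ram_gb - weights_gb) // kv_gb_per_session)


if __name__ == "__main__":
    # Hypothetical inputs: 800 GB/s bandwidth, 198 GB of weights,
    # 2 GB of KV cache per session at full context.
    for n in (1, 2, 4, 8):
        print(n, round(decode_tokens_per_second(800, 198, 2, n), 2))
```

Under these assumptions aggregate throughput grows almost linearly with the number of sessions while the KV cache stays small next to the weights. In practice compute eventually becomes the limit, so treat the result as an upper bound.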