But where are you going to find an Nvidia GPU with 128+ GB of memory at an enthusiast-compatible price?
That might even be true, but how large is the TAM for such machines?
Some Chinese sources sell modded Nvidia GPUs with extra VRAM. They're quite affordable in comparison to even a Mac Pro.
And that's before even getting into competing on energy consumption!
The Nvidia DGX Spark is exactly this and in the same price and performance bracket.
You can still buy used 3090 cards on eBay. Five of them will give you 120GB of VRAM and will blow away any Mac in terms of performance on LLM workloads. They've gone up in price lately and are now about $1,100 each, but at one point they were $700-800 each.
Where are you gonna find Apple hardware with 128GB of memory at an enthusiast-compatible price?
The cheapest Apple desktop with 128GB of memory shows up as costing $3,499 for me, which isn't very "enthusiast-compatible": that's about 3x the minimum salary in my country!
You don't need it if you use llama.cpp on Windows, or if you compile it on Linux with CUDA 13 and the correct kernel HMM support, and you're only using MoE models (which, tbh, you should be doing anyway).
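For reference, a rough sketch of the kind of setup that comment describes: llama.cpp built with CUDA, running a MoE model with the large expert tensors kept in system RAM via `--override-tensor` so only the dense weights and KV cache need to fit in VRAM. The flag names and tensor-name pattern are from recent llama.cpp builds and may differ in yours; the model path is a placeholder.

```shell
# Build llama.cpp with CUDA support (assumes the CUDA toolkit is installed).
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Run a MoE model: push all layers to the GPU (-ngl 99), but override the
# per-expert FFN weight tensors to stay on the CPU side. The regex matches
# the expert tensor naming used in common MoE GGUFs; adjust for your model.
./build/bin/llama-cli \
  -m ./models/your-moe-model.gguf \
  -ngl 99 \
  --override-tensor "ffn_.*_exps.=CPU" \
  -p "Hello"
```

The point of the split is that only a small subset of experts is active per token, so keeping them in (cheaper, larger) system RAM costs far less throughput than it would for a dense model of the same total size.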