Try DwarfStar 4 with the --power flag: https://github.com/antirez/ds4#reducing-heat-power-usage-and...
DwarfStar is the only thing I've run that doesn't try to make my Mac Studio 128GB take off. Yes, it gets hot during inference, but it cools down quickly when idle, something I haven't experienced with Ollama, LMStudio or OMLX.
Can you run Qwen 3.6 27B on antirez/ds4 now? I thought it was all about the DeepSeek models.