janandonly, today at 9:37 AM

> Please make sure you have a Mac with more than 32GB of unified memory.

Yeah, I can still save money by buying a cheaper device with less RAM and just paying my PPQ.AI or OpenRouter.com fees.


Replies

zozbot234, today at 9:47 AM

> Please make sure you have a Mac with more than 32GB of unified memory.

The lack of proper support for SSD offload (via mmap or otherwise) is really the worst part about this. There's no underlying reason why a 3B-active model shouldn't be able to run, however slowly, on a cheap 8GB MacBook Neo with active weights being streamed in from SSD and cached. (This seems to be in the works for GGML/GGUF as part of upgrading to newer upstream versions; no idea whether MLX inference can also support this easily.)
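
To make the mmap idea concrete, here is a rough sketch using only Python's standard library. It is not how GGML/GGUF or MLX lay out or load weights; the file format, `NUM_EXPERTS`, `FLOATS_PER_EXPERT`, and the function names are all made up for illustration. The point is just that a memory-mapped file lets you touch only the byte ranges for the active weights, and the OS pages those in from SSD (and caches them) on demand.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical toy layout: experts stored back to back as float32 values.
NUM_EXPERTS = 8
FLOATS_PER_EXPERT = 1024
EXPERT_BYTES = FLOATS_PER_EXPERT * 4  # 4 bytes per float32


def write_toy_weights(path):
    """Write a toy weight file in which expert i is filled with float(i)."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            values = [float(i)] * FLOATS_PER_EXPERT
            f.write(struct.pack(f"<{FLOATS_PER_EXPERT}f", *values))


def load_active_experts(path, active_ids):
    """Map the whole file, but read only the slices for the active experts.

    Pages outside the requested slices are never touched, so the OS has no
    reason to bring them into RAM.
    """
    out = {}
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for i in active_ids:
                start = i * EXPERT_BYTES
                chunk = mm[start:start + EXPERT_BYTES]
                out[i] = struct.unpack(f"<{FLOATS_PER_EXPERT}f", chunk)
    return out


def demo():
    """Build a temporary toy file and load two of its eight experts."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_toy_weights(path)
        return load_active_experts(path, [2, 5])
    finally:
        os.remove(path)
```

Calling `demo()` returns a dict with entries only for experts 2 and 5. A real runtime would hand the mapped pages to the compute kernels directly instead of copying them out, and how fast this is depends entirely on SSD bandwidth and how well the page cache holds the hot weights.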