
cwoolfe · yesterday at 5:46 PM

Is there an inference-speed benefit on M-series MacBooks, or is the primary goal here simply to get inference working on other platforms (like iOS)? If there is a speed benefit, it would be great to see tokens/s for this vs. Ollama.
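
(For anyone who wants to produce that tokens/s number themselves, here is a minimal sketch using the ollama Python client; the model tag and prompt are placeholders, not anything from this thread.)

    # Minimal sketch: measure Ollama's decode rate via the Python client
    # (pip install ollama). Model tag below is a placeholder -- substitute
    # whatever you have pulled locally.
    import ollama

    resp = ollama.generate(
        model="llama3.2:1b",  # placeholder model tag
        prompt="Explain the Apple Neural Engine in one paragraph.",
    )

    # Ollama's final response reports eval_count (tokens generated) and
    # eval_duration (nanoseconds), so tokens/s falls out directly.
    tok_s = resp["eval_count"] / (resp["eval_duration"] / 1e9)
    print(f"{resp['eval_count']} tokens -> {tok_s:.1f} tok/s")

From the CLI, `ollama run <model> --verbose` prints the same eval-rate statistics without any code.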


Replies

SparkyMcUnicorn · today at 12:43 AM

See my other comment for results.

mlx is much faster, but anemll appeared to use only 500 MB of memory, compared to the 8 GB mlx used.
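
(A rough way to sanity-check memory numbers like these is to poll the runner's resident set size while it generates; a sketch below using psutil. The mlx_lm command and model repo are placeholders, and RSS won't fully capture Metal or ANE allocations, so treat the result as a ballpark.)

    # Rough sketch: poll a child inference process's RSS with psutil
    # (pip install psutil) and report the peak.
    import subprocess
    import time

    import psutil

    def peak_rss_mb(proc: subprocess.Popen, interval: float = 0.1) -> float:
        """Poll a child process's RSS until it exits; return the peak in MB."""
        ps = psutil.Process(proc.pid)
        peak = 0
        while proc.poll() is None:  # child still running
            try:
                peak = max(peak, ps.memory_info().rss)
            except psutil.NoSuchProcess:
                break
            time.sleep(interval)
        return peak / 1e6

    # Placeholder command -- point this at whichever runner you're measuring.
    child = subprocess.Popen(["python", "-m", "mlx_lm.generate",
                              "--model", "mlx-community/Llama-3.2-1B-Instruct-4bit",
                              "--prompt", "hello"])
    print(f"peak RSS: {peak_rss_mb(child):.0f} MB")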