Is there a performance benefit for inference speed on M-series MacBooks, or is the primary task here simply to get inference working on other platforms (like iOS)? If there is a performance benefit, it would be great to see tokens/s of this vs. Ollama.
See my other comment for results.
MLX is much faster, but ANEMLL appeared to use only 500 MB of memory, compared to the 8 GB MLX used.
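
For anyone who wants to put actual tokens/s numbers on this, here is a rough sketch of one way to measure decode speed on both sides: Ollama's non-streaming /api/generate response reports eval_count and eval_duration, and the MLX side can just be timed around mlx_lm's generate(). This assumes a local Ollama server and an installed mlx-lm; the model names are placeholders, not a recommendation, and the MLX timing lumps prompt processing in with generation.

```python
import time
import requests                      # assumes a local Ollama server on :11434
from mlx_lm import load, generate    # assumes mlx-lm is installed

PROMPT = "Explain the Neural Engine in one paragraph."
MAX_TOKENS = 256

# Ollama: eval_count = tokens generated, eval_duration = nanoseconds spent generating them.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2:1b", "prompt": PROMPT,
          "options": {"num_predict": MAX_TOKENS}, "stream": False},
).json()
ollama_tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)

# MLX: time generate() and count the tokens in the returned text.
model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct-4bit")
start = time.perf_counter()
text = generate(model, tokenizer, prompt=PROMPT, max_tokens=MAX_TOKENS)
elapsed = time.perf_counter() - start
mlx_tps = len(tokenizer.encode(text)) / elapsed

print(f"Ollama: {ollama_tps:.1f} tok/s   MLX: {mlx_tps:.1f} tok/s")
```

Peak memory is harder to compare apples-to-apples (Ollama runs in a separate server process), so watching the processes in Activity Monitor during generation is probably the simplest way to reproduce the 500 MB vs 8 GB observation above.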