Hamuko • yesterday at 2:27 PM

I've tried to use a local LLM on an M4 Pro machine and it's quite painful. Not surprised that people into LLMs would pay for tokens instead of trying to force their poor MacBooks to do it.

Replies

atwrk • yesterday at 2:40 PM

Local LLM inference is all about memory bandwidth, and an M4 Pro only has about as much as a Strix Halo or DGX Spark. That's why the older Ultras are popular with the local LLM crowd.
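
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes decoding is purely bandwidth-bound, so every generated token has to read all active weights from memory once. The bandwidth figures are approximate published peaks and the 16 GB model size is a hypothetical example, not a measurement; real throughput will come in lower.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed if each token reads all active weights once."""
    if active_weights_gb <= 0:
        raise ValueError("active_weights_gb must be positive")
    return bandwidth_gb_s / active_weights_gb


# Assumed peak memory bandwidths in GB/s (approximate published figures).
BANDWIDTH_GB_S = {"M4 Pro": 273.0, "M2 Ultra": 800.0}

# Hypothetical dense model whose quantized weights occupy 16 GB.
MODEL_GB = 16.0

for chip, bandwidth in BANDWIDTH_GB_S.items():
    ceiling = decode_tokens_per_second(bandwidth, MODEL_GB)
    print(f"{chip}: at most ~{ceiling:.1f} tokens/s")
```

The same arithmetic suggests why mixture-of-experts models feel faster: only the active parameters are read per token, so `active_weights_gb` is much smaller than the full model size.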

usagisushi • yesterday at 5:05 PM

Qwen 3.5 35B-A3B and 27B have changed the game for me. I expect we'll see something comparable to Sonnet 4.6 running locally sometime this year.

freeone3000 • yesterday at 2:53 PM

I’m super happy with it for embedding, image recognition, and semantic video segmentation tasks.

giancarlostoro • yesterday at 2:36 PM

What are the other specs, and what does your setup look like? You need at least 24GB of RAM to run models of 16GB or less.
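
A rough way to sanity-check that rule of thumb, as a sketch only: add the model size, some room for the KV cache, and a reserve for the OS and other apps, then compare against total RAM. The 2 GB and 6 GB defaults below are hypothetical placeholders chosen for illustration, not measured values, and `fits_in_memory` is a made-up helper name.

```python
def fits_in_memory(
    total_ram_gb: float,
    model_gb: float,
    kv_cache_gb: float = 2.0,        # hypothetical allowance for context/KV cache
    system_reserve_gb: float = 6.0,  # hypothetical reserve for the OS and other apps
) -> bool:
    """Return True if model, KV cache, and system reserve fit in total RAM."""
    return model_gb + kv_cache_gb + system_reserve_gb <= total_ram_gb


for ram in (16.0, 24.0, 48.0):
    verdict = "fits" if fits_in_memory(ram, model_gb=16.0) else "does not fit"
    print(f"{ram:.0f} GB RAM, 16 GB model: {verdict}")
```

With these placeholder numbers a 16 GB model just fits in 24 GB and does not fit in 16 GB, which lines up with the rule of thumb; longer contexts would need a larger KV cache allowance.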

andoando • yesterday at 3:58 PM

Local LLMs are useful for stuff like tool calling.
