.. who is running LLMs on CPU instead of GPU or TPU/NPU
Depends on the size of the model and how much VRAM you have (and how long you're willing to wait).
Not all of us own GPUs worth using. Now, among people using Macs... maybe if you had a hardware failure?
Actually, that's a really good question - I hadn't considered that the comparison here is really CPU vs Metal (CPU+GPU).
To answer the question, though - I think this would be useful when you're building an app that wants to run a small AI model while keeping the GPU free for graphics work, which I'm guessing is why Apple put the Neural Engine into their hardware in the first place.
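For concreteness, here's a minimal Swift sketch of what that looks like with Core ML - the computeUnits option is the real API for steering work toward the Neural Engine, but "MyModel" is just a placeholder for whatever Xcode-generated model class you'd be loading:

    import CoreML

    // Placeholder: "MyModel" stands in for an Xcode-generated model class.
    let config = MLModelConfiguration()

    // Restrict Core ML to the CPU and the Neural Engine, leaving the GPU
    // free for rendering (.cpuAndNeuralEngine needs macOS 13 / iOS 16+).
    config.computeUnits = .cpuAndNeuralEngine

    do {
        let model = try MyModel(configuration: config)
        // ... run predictions with `model` while the GPU handles graphics ...
    } catch {
        print("Failed to load model: \(error)")
    }

If you instead set computeUnits to .all, Core ML is free to schedule work on the GPU as well, which is exactly what you'd want to avoid in the graphics-heavy case above.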
Here is an interesting comparison between the two from a whisper.cpp thread - ignoring startup times - the CPU+ANE seems about on par with CPU+GPU: https://github.com/ggml-org/whisper.cpp/pull/566#issuecommen...