regexorcist • today at 5:03 PM

Curious if you tested llama.cpp and still found oMLX faster? I haven't tried the latter myself, might give it a go.

egorfine • today at 5:12 PM

Oh yeah, I did test various solutions with different settings and quants.

llama.cpp is about 1/3 slower on Apple Silicon.
