PrismML provides a llama.cpp fork that is compatible with the 1-bit models:
https://github.com/PrismML-Eng/llama.cpp
After failed attempts with Ollama and mainline llama.cpp, the fork worked on my M5 MBA.
Edit: Typos