like someone said above: brew install llama.cpp llama-server -hf ggml-org/gemma-4-E4B-it-GGUF...

homarp • today at 7:58 AM • 1 reply • view on HN

like someone said above: brew install llama.cpp

llama-server -hf ggml-org/gemma-4-E4B-it-GGUF --port 8000 (with MCP support and web chat interface)

and you have OpenAI API on the same 8000 port. (https://github.com/ggml-org/llama.cpp/tree/master/tools/serv... lists the endpoints)

Replies

AndroTux • today at 1:29 PM

And why do I use ggml-org/gemma-4-E4B-it-GGUF instead of one of the 162 other models that can be found under the ggml-org namespace? And how do I even know that this is the namespace to look at?

That's what I meant by model management. I'm too tired to scroll through a bazillion models that all have very cryptic names and abbreviations just to find the one that works well on my system with my software stack.

I want a simple interface that a tool like me can scroll through easily, click on, and then have a model that works well enough. If I put in that much brain power to get my LLM working, I might as well do the work myself instead of using an LLM in the first place.

➕ show 1 reply

alt Hacker News

Replies