Hacker News

adrian_b · today at 11:32 AM · 1 reply

What you say was true in the past.

As other posters report, llama-server now implements an OpenAI-compatible API, and you can also connect to it with any web browser.

I have not yet tried the OpenAI API, but it should have eliminated the last advantage of Ollama.
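As a minimal sketch of what that looks like in practice: llama-server exposes an OpenAI-style `/v1/chat/completions` endpoint on its default port 8080, so any HTTP client can talk to it. The snippet below uses only the Python standard library; the model name is a placeholder, since llama-server serves whatever model it was started with.

```python
import json
import urllib.request

# Assumes llama-server is already running locally, e.g.:
#   llama-server -m model.gguf --port 8080
BASE_URL = "http://127.0.0.1:8080"

def build_chat_request(prompt: str, model: str = "local") -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        # The model name is nominal: llama-server answers with
        # whichever model it loaded at startup.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str) -> str:
    """POST the payload and return the assistant's reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # OpenAI-compatible responses put the text at choices[0].message.content
    return reply["choices"][0]["message"]["content"]
```

Because the request and response shapes match the OpenAI API, the official OpenAI client libraries should also work if pointed at the local base URL.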

I do not believe that the Ollama "curated" models are significantly easier for a newbie to use than models downloaded directly from Huggingface.

On Huggingface you have much more detail about each model, which helps you navigate the jungle of countless model variants and find what is most suitable for you.

The fact criticized in TFA, that the Ollama "curated" list can be misleading about the characteristics of the models, is in my view a serious problem, and reason enough for me not to use such "curated" models.

I am not aware of any way of choosing and downloading the right model for local inference that is superior to using the Huggingface site directly.

I believe that choosing a model is the most intimidating part for a newbie who wants to run inference locally.

Once a good choice is made, downloading the model, installing llama.cpp, and running llama-server are trivial steps that require minimal skill.
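Those last two steps can be sketched in a few lines. The URL below is a hypothetical placeholder: substitute the "resolve/main" download link of whatever GGUF file you picked on Huggingface. The sketch assumes llama.cpp is already installed and `llama-server` is on your PATH; `-m` and `--port` are its standard flags for the model file and listening port.

```python
import subprocess
import urllib.request
from pathlib import Path

# Hypothetical URL; replace with the direct link of your chosen GGUF file.
MODEL_URL = "https://huggingface.co/some-org/some-model/resolve/main/model.gguf"

def model_filename(url: str) -> str:
    """The local file name is the last path component of the URL."""
    return url.rsplit("/", 1)[-1]

def download_model(url: str, dest_dir: str = "models") -> Path:
    """Download the GGUF file once, skipping it if already present."""
    dest = Path(dest_dir) / model_filename(url)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, dest)
    return dest

def serve(model_path: Path, port: int = 8080) -> subprocess.Popen:
    """Start llama-server on the downloaded model."""
    return subprocess.Popen(
        ["llama-server", "-m", str(model_path), "--port", str(port)]
    )
```

That is the entire workflow: once the server is up, any browser or OpenAI-compatible client can talk to it on the chosen port.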


Replies

justinclift · today at 12:28 PM

> On Huggingface you have much more details about models...

For a (brand new!) newbie, it's very, very likely to be information overload.

They're still at the start of their journey, so simple tends to be better for 90% of users. ;)