Does it have a model registry with an API and hot swapping or you still have to use sometime like ll...

speedgoose • today at 6:55 AM • 1 reply • view on HN

Does it have a model registry with an API and hot swapping or you still have to use sometime like llama swap as suggested in the article ? Or is it CLI?

Replies

dminik • today at 7:08 AM

You can have multiple models served now with loading/unloading with just the server binary.

https://github.com/ggml-org/llama.cpp/blob/master/tools/serv...

➕ show 1 reply

alt Hacker News

Replies