you can pull directly from huggingface with llama.cpp, and it also has a decent web chat included
Does it have a model registry with an API and hot swapping or you still have to use sometime like llama swap as suggested in the article ? Or is it CLI?
Does it have a model registry with an API and hot swapping or you still have to use sometime like llama swap as suggested in the article ? Or is it CLI?