On the topic of local models, is there a good equivalent to something like Claude's chat interface? I've recently started transitioning to open models after getting fed up with Claude's usage limits (I'm not in a position to drop $200/month), and for coding tasks Kimi 2.6 has been about the same as Sonnet in my experience. The only thing I've found myself missing is a nice interface to ask it questions and have it help me with my math assignments.
Most of the common ways to run local LLMs include a chat interface. llama.cpp's `llama-server` serves a chat interface on port 8080, along with an OpenAI-compatible API. LM Studio is a desktop app with a chat interface and an API as well; Unsloth Studio, too.
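For anything the built-in chat UI doesn't cover, the OpenAI-compatible API means you can also script against it. A minimal sketch using the `openai` Python package, assuming `llama-server` is already running on its default port; the model name is just a placeholder, since the server answers with whatever model it loaded:

```python
# Assumes llama-server is running locally, e.g. started with:
#   llama-server -m your-model.gguf --port 8080
from openai import OpenAI

# llama-server doesn't require an API key, but the client wants one set.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local",  # placeholder; the server uses the model it has loaded
    messages=[{"role": "user", "content": "Explain L'Hopital's rule briefly."}],
)
print(resp.choices[0].message.content)
```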
LM Studio is nice in that it makes it easy to add tools, like search. Qwen 3.6 is such a small model that it lacks a lot of world knowledge (so it can hallucinate at an uncomfortable rate, which is a common failure mode of very small models), but it can use tools, so being able to search lets it do research before answering. Its reasoning and tool calling are pretty good, so it's actually quite effective.

I've been comparing Gemma 4 (31B at 8 bits, also very good with tools and reasoning for its size) and Qwen 3.6 (27B at 8 bits) against Claude Opus and Gemini Pro lately. Obviously the frontier models are better, but most of the time I find the tiny models are fine. I'm still not quite at the point where I'd be willing to code with local models, though: the time wasted on hallucinations, logic bugs, and sloppy coding practices is much higher, as is the cost of security bugs that make it past review.
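If you want the same search-tool trick outside LM Studio's UI, its local server speaks the OpenAI tool-calling format too. A minimal sketch, assuming LM Studio's server is running on its default port and the loaded model supports tool calls; the model name and the `search()` helper are placeholders, not anything LM Studio ships:

```python
import json
from openai import OpenAI

# LM Studio's local server defaults to http://localhost:1234/v1.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def search(query: str) -> str:
    # Stub: wire this up to whatever search backend you actually use.
    return f"(search results for {query!r} would go here)"

tools = [{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for up-to-date information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "What changed in the latest llama.cpp release?"}]
resp = client.chat.completions.create(model="local-model", messages=messages, tools=tools)
msg = resp.choices[0].message

# If the model decided to call the tool, run it and feed the result back.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": search(args["query"]),
        })
    resp = client.chat.completions.create(model="local-model", messages=messages, tools=tools)

print(resp.choices[0].message.content)
```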
Open WebUI or Jan (https://www.jan.ai/). Both work well with Ollama.
I re-created Claude's interface pretty closely here; feel free to fork it: https://github.com/mudkipdev/chat
Ollama does this, as does llama-server from llama.cpp.
You can try Open WebUI. It's genuinely useful when it comes to running open models locally with a clean interface.
I've been mostly using LM Studio for this recently. Ollama has an OK chat UI now too. `brew install llama.cpp` gets you `llama-server`, which provides quite a good web UI.
With Ollama* you can use Claude Code with `ollama launch claude`
llama-server from the llama.cpp package has a local web interface.
Yes, but not exactly.
- A lot of people are suggesting llama-server's web UI, but that requires you to run local inference (llama.cpp), it persists your chats in the browser rather than on the server (so you can lose them), and it doesn't support much functionality.
- There are some pure-browser chat interfaces that are like llama-server's web UI but let you use remote LLMs. This is closer to what you want, but everything is still stored in the browser, so backup is harder.
- There's LocalAI, which is like the llama-server option, but with more built in, and it persists data to disk. It's flashy and very easy if all you want is local inference.
- There's LM Studio, which is similar to LocalAI, but as a desktop app.
- There's OpenWebUI, which is like LocalAI except you don't do local inference; you point it at remote LLMs. To be honest, it sucks: it just stops working a lot of the time, the UX is terrible, and there are lots of weird bugs.
- There's OpenHands, which is more like a Codex/Claude Code web UI. You run it locally and connect it to remote LLMs. It's kinda clunky, limited, and poorly designed, and like most coding agents it doesn't support all the features you'd want the way LocalAI/OpenWebUI do.
- There's OpenCode's web UI, which is like OpenHands, but less crappy.
- There's Jan, which is probably what you want. It's a desktop app rather than a web UI.