This appears to be a key tool for providing search to local models.
I'm curious what setups folks use to provide this functionality.
Since the quantized 24B parameter Gemma model came out, I've had good luck with tool calling on a 4070 Ti Super.
Successful tool calling is what finally made the local experience useful.
I should note this is for general use, not a coding-specific context.
It has a JSON mode that you need to enable in settings. Once that's on, you can write a simple Python script to query it, or have the agent use `curl` and `jq`.
It's at the bottom of this page: https://docs.searxng.org/admin/settings/settings_search.html
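For anyone who wants a starting point, here's a rough sketch of the kind of script I mean. It assumes a SearXNG instance at `http://localhost:8080` (my guess at a typical local setup, adjust to yours) with `json` added to the `search.formats` list in `settings.yml`. The helper names (`build_search_url`, `parse_results`, `search`) are made up for illustration, and the result fields I read (`title`, `url`, `content`) are what I see in the JSON responses, so check them against your instance.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def parse_results(payload: dict, limit: int = 5) -> list[dict]:
    """Keep only the fields a model usually needs from each result."""
    hits = payload.get("results", [])[:limit]
    return [
        {
            "title": hit.get("title", ""),
            "url": hit.get("url", ""),
            "snippet": hit.get("content", ""),
        }
        for hit in hits
    ]


def search(base_url: str, query: str, limit: int = 5, timeout: float = 10.0) -> list[dict]:
    """Query the instance and return trimmed results.

    Fails with an HTTP error if the json format is not enabled in settings.
    """
    url = build_search_url(base_url, query)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return parse_results(payload, limit)


# Example (assumed local instance, needs the server running):
#   for hit in search("http://localhost:8080", "local llm tool calling"):
#       print(hit["title"], hit["url"])
```

Trimming each result down to title, URL, and snippet keeps the tool output small, which seems to matter for smaller local models. The `curl` and `jq` route hits the same `/search?q=...&format=json` endpoint and filters on `.results`.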
I am also interested in what a full local AI stack with web search and other tools looks like. As far as I can tell, SearX does not embed an MCP server, so it can't be called directly from llama-server, for example. Open WebUI does have an integration for SearX and other providers, but the results I got from it weren't particularly impressive.
Are you running a quant?
I have a friend with a 4080 who wants to experiment with local models, and those cards should be similar enough. Can you give any more detail about your setup? Ty!
TinySearch MCP. If you use Unsloth Studio, however, it simply calls DuckDuckGo's HTML API instead, and that works pretty well.