SearXNG has been my daily internet search for 5+ years now, with YaCy backends and others as fallback. I also build internal document search and RAG applications with this setup (SearXNG also supports JSON results). However, there are some downsides I accept for the sake of privacy: 1. It's slower and the results are not as good as with other search engines, but it's fast and good enough for most of my queries. 2. From time to time you get blocked by DuckDuckGo, Brave or whatever search engine and you have to solve some captchas. You can prevent this by getting and using API keys from them.
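For anyone who wants to try the JSON side, here is a minimal sketch of how I'd query an instance from Python. The base URL is an assumption (a local instance), the helper names are made up for illustration, and JSON output has to be enabled under `search.formats` in `settings.yml` first, otherwise the request is typically rejected.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local instance; adjust to wherever yours listens.
SEARXNG_URL = "http://localhost:8888"


def build_search_url(query, base_url=SEARXNG_URL):
    """Build a /search URL that asks SearXNG for JSON instead of HTML."""
    return f"{base_url}/search?{urlencode({'q': query, 'format': 'json'})}"


def extract_hits(payload):
    """Reduce a decoded JSON response to (title, url, engine) tuples."""
    return [
        (r.get("title", ""), r.get("url", ""), r.get("engine", ""))
        for r in payload.get("results", [])
    ]


def search(query, base_url=SEARXNG_URL, timeout=10):
    """Run one query against the instance and return the extracted hits."""
    with urlopen(build_search_url(query, base_url), timeout=timeout) as resp:
        return extract_hits(json.load(resp))
```

Something like `search("small web")` then gives you a plain list you can feed into a RAG pipeline or an internal search page.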
The nice thing about using your own backend is that you can prioritize it in the results. For example, if I crawl the small web and other sites that are important to me, these sites come up first in the results.
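As far as I know, the server-side way to do this is the per-engine `weight` option in `settings.yml`. If you consume the JSON output yourself, the same idea can be sketched client-side. This is only an illustration: `rerank`, the engine name `yacy` and the boost factor are assumptions, and the `score` and `engine` fields are taken from what the JSON results contain.

```python
def rerank(results, preferred_engines, boost=2.0):
    """Sort results by score, multiplying the score of preferred engines."""

    def boosted(result):
        score = float(result.get("score", 0.0))
        # Results from your own backend get pushed up by a fixed factor.
        if result.get("engine") in preferred_engines:
            score *= boost
        return score

    return sorted(results, key=boosted, reverse=True)


# Example: the YaCy hit has a lower raw score but ends up first.
results = [
    {"url": "https://example.org/a", "engine": "duckduckgo", "score": 1.5},
    {"url": "https://example.org/b", "engine": "yacy", "score": 1.0},
]
ranked = rerank(results, preferred_engines={"yacy"})
```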
> SearXNG has been my daily internet search for 5+ years now
Same here
> with YaCy backends and others as fallback.
Do you run your own "super fast" YaCy instance? Or one with specific settings?
My experience with YaCy is that it doesn't fit as a SearX backend, since YaCy kind of slowly streams results for about 30 seconds...
I also have a local `kiwix-serve` serving ZIM files of Wikipedia, Wiktionary, Gutenberg, ArchWiki, etc., but it has the same problem: the kiwix search engine [0] doesn't really fit as a backend for SearX, as it returns too many results and pollutes the SearX result page.
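One workaround I could imagine, at least when consuming the JSON output rather than the HTML page, is capping how many results each engine may contribute. A rough sketch, where `cap_per_engine` and the limit are made up for illustration and the `engine` field is assumed to be present on each result:

```python
from collections import Counter


def cap_per_engine(results, limit=3):
    """Keep at most `limit` results per engine, preserving the original order."""
    seen = Counter()
    kept = []
    for result in results:
        engine = result.get("engine", "")
        if seen[engine] < limit:
            kept.append(result)
            seen[engine] += 1
    return kept
```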
What I haven't done yet is try to plug SearX into a local Recoll instance [1]. Recoll doesn't support indexing ZIM files, but it could be useful for other archived HTML documents.
I would be curious to know more about a working setup since search is hard to get right.
- [0] https://kiwix-tools.readthedocs.io/en/latest/kiwix-serve.htm...
- [1] https://docs.searxng.org/dev/engines/online/recoll.html