
cagz, last Monday at 1:55 PM (Hacker News)

Frustrated that RAG solutions can't answer schema-style questions such as "how many X are there", I created DuoRAG: https://github.com/cagriy/duo-rag

It maintains a vector store and a SQL database. The vector store handles the usual RAG operations, while queries that require counting, summation, or selection are routed to the SQL database.
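To make the routing idea concrete, here is a minimal sketch. It is not DuoRAG's actual code: the keyword heuristic stands in for whatever the LLM does to classify a query, and the `people` table, `route`, and `count_people` are hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical keyword heuristic standing in for an LLM-based classifier.
# Substring matching is crude (e.g. "count" also matches "country").
AGGREGATE_HINTS = ("how many", "count", "total", "sum of", "average", "list of")


def route(question: str) -> str:
    """Return "sql" for counting/summation/selection questions, else "vector"."""
    q = question.lower()
    return "sql" if any(hint in q for hint in AGGREGATE_HINTS) else "vector"


def count_people(conn: sqlite3.Connection, occupation: str) -> int:
    """Answer a "how many X are there" question from the structured store."""
    row = conn.execute(
        "SELECT COUNT(*) FROM people WHERE occupation = ?", (occupation,)
    ).fetchone()
    return row[0]


# Illustrative data only; in practice the rows would be extracted from documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, occupation TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ada", "scientist"), ("Bob", "painter"), ("Cy", "scientist")],
)
```

A question like "How many scientists are there?" would go to `count_people`, while an open-ended question would fall through to the normal vector search.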

You can start with an initial schema or let it discover the schema itself. Then, in day-to-day use, if a user query cannot be answered, a candidate schema entry is created, to be populated on the next backfill run.

So in actual use, a user asks a question such as "Give me the list of people who are scientists". If it is not in the schema, the LLM suggests checking back later. The backfill runs at night. The next day, it can answer the same question without issues.