This is essentially tool use with a filesystem interface — the LLM decides what to read instead of a retrieval pipeline choosing for it. Clean idea, and it sidesteps the chunking problem entirely.
Curious about the latency though. RAG is one round trip: embed query, fetch chunks, generate. This approach seems like it needs multiple LLM calls to navigate the tree before it can answer. How many hops does it typically take, and did you have to do anything special to keep response times reasonable?
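To make the round-trip concern concrete, here's a runnable sketch of that navigation loop. The model call is a scripted stub (a real deployment would hit an LLM API), and the tool names and file paths are made up; the point is just that every hop is a full model round trip, versus RAG's single generate call:

```python
import json
import os

def call_model(transcript):
    # Stub standing in for a real LLM call: list the directory,
    # open one file, then answer. Paths here are hypothetical.
    turn = sum(1 for m in transcript if m["role"] == "assistant")
    if turn == 0:
        return {"tool": "list_dir", "path": "."}
    if turn == 1:
        return {"tool": "read_file", "path": "docs/setup.md"}
    return {"answer": "Install steps are in docs/setup.md."}

def navigate(question, root, max_hops=5):
    """Each loop iteration is one model round trip -- this is where
    the latency overhead relative to single-shot RAG comes from."""
    transcript = [{"role": "user", "content": question}]
    hops = 0
    while hops < max_hops:
        action = call_model(transcript)
        transcript.append({"role": "assistant", "content": json.dumps(action)})
        if "answer" in action:
            return action["answer"], hops
        if action["tool"] == "list_dir":
            result = os.listdir(os.path.join(root, action["path"]))
        else:  # read_file
            with open(os.path.join(root, action["path"])) as f:
                result = f.read()
        transcript.append({"role": "tool", "content": str(result)})
        hops += 1
    return None, hops
```

Even with this toy two-hop policy, answering takes three model calls (two tool hops plus the final answer), so hop count directly multiplies time-to-first-token.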
In their case, the baseline it was competing with was cloning the entire repo before starting a session, which took tens of seconds.