Hacker News

charcircuit, yesterday at 4:14 PM

I mentioned coding as a use case in my comment you replied to. You were asking for an example for when one wouldn't use vector search and I provided one. I did not say similarity search would be a substitute. I said that for the coding case you do not need it.

>you would still likely want to build a similarity search engine

In practice, tools like Claude Code, Codex, Gemini, Kimi Code, etc. are getting away with searching for code with grep / find and understanding code by loading a sufficient amount of it into the context window. That is sufficient to understand higher-level concepts in the code. Maintaining a vector database on top of this adds complexity, and that complexity is not free.
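A rough sketch of what grep-style retrieval amounts to, for illustration only. This is not how any of the tools named above is actually implemented; `grep_files` and `pack_context` are hypothetical names, and the character budget is a crude stand-in for a token limit.

```python
import re
from pathlib import Path


def grep_files(root, pattern, suffixes=(".py",)):
    """Return (path, match_count) for files under root whose text matches pattern."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        count = len(regex.findall(text))
        if count:
            hits.append((path, count))
    # Most matches first; ties broken by path so the order is deterministic.
    hits.sort(key=lambda item: (-item[1], str(item[0])))
    return hits


def pack_context(hits, max_chars):
    """Concatenate whole files, most matches first, skipping any that would exceed the budget."""
    parts, used = [], 0
    for path, _count in hits:
        text = path.read_text(encoding="utf-8", errors="ignore")
        block = f"# file: {path}\n{text}\n"
        if used + len(block) > max_chars:
            continue
        parts.append(block)
        used += len(block)
    return "".join(parts)
```

Something like `pack_context(grep_files("src", r"parse_config"), 20_000)` would then produce the text handed to the model. There is no index to build or keep in sync, which is the simplicity being pointed at here.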


Replies

menaerus, today at 6:23 AM

In your comment you said "There is more to searching than building a basic similarity search.", which assumed and implied all kinds of things and was completely unnecessary.

> In practice tools like Claude Code, Codex, Gemini, Kimi Code, etc are getting away with searching for code with grep / find and understanding code by loading a sufficient amount of code into the context window

"Getting away" is the formulation I would use as well. "Sufficient amount", OTOH, is arguable and subjective. What suffices in one use case does not in another, so how sufficient it really is depends on the usage patterns, e.g. the type and size of the codebases and the actual queries asked.

The crux of the problem is how much of the codebase, and which parts of it, you want to load into the context without blowing it up, while still keeping the model able to reason about the codebase correctly.

And I find it hard to argue that building a vector database would not help with exactly that problem.
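To make the selection problem concrete, here is a minimal sketch of ranking code chunks by similarity to a query and keeping only what fits a budget. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, a plain list stands in for a vector database, and the budget is counted in characters rather than tokens.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts.

    Splitting on non-letters also breaks snake_case identifiers into words.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_chunks(query, chunks, max_chars):
    """Rank chunks by similarity to the query, then keep the best ones that fit the budget."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        if used + len(chunk) <= max_chars:
            picked.append(chunk)
            used += len(chunk)
    return picked
```

The point of the sketch is only the shape of the approach: the ranking decides which parts of the codebase get loaded, and the budget decides how much. A real setup would swap in an actual embedding model and index.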