Embeddings are good at partitioning document stores at a coarse-grained level, and they can be very useful for documents where there's a lot of keyword overlap and the semantic differences are spread across the text rather than concentrated in a few terms. They're definitely not a good primary recall mechanism, though, and they often don't fully pull their weight for their cost in hybrid setups, so it's worth running evals for your specific use case (rough sketch below).
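A minimal sketch of what such an eval could look like, comparing recall@k for BM25-only, embedding-only, and a naive hybrid blend on your own labeled pairs. The libraries (rank_bm25, sentence-transformers), the model name, and the blend weight are all assumptions/placeholders, not recommendations:

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = ["...your corpus..."]      # doc id = index into this list
labeled = [("example query", 0)]  # (query, relevant doc id) pairs
k = 10

bm25 = BM25Okapi([d.lower().split() for d in docs])
model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
doc_vecs = model.encode(docs, normalize_embeddings=True)

def recall_at_k(score_fn):
    # Fraction of queries whose relevant doc lands in the top k.
    hits = 0
    for query, rel_id in labeled:
        top = np.argsort(score_fn(query))[::-1][:k]
        hits += rel_id in top
    return hits / len(labeled)

def bm25_scores(q):
    return np.array(bm25.get_scores(q.lower().split()))

def dense_scores(q):
    # Cosine similarity, since both sides are unit-normalized.
    return doc_vecs @ model.encode(q, normalize_embeddings=True)

def hybrid_scores(q, alpha=0.5):
    # Min-max normalize each signal before blending so scales match.
    b, d = bm25_scores(q), dense_scores(q)
    norm = lambda x: (x - x.min()) / (x.ptp() or 1.0)
    return alpha * norm(b) + (1 - alpha) * norm(d)

for name, fn in [("bm25", bm25_scores), ("dense", dense_scores), ("hybrid", hybrid_scores)]:
    print(name, recall_at_k(fn))
```

The normalization step matters more than it looks: BM25 and cosine scores live on different scales, so blending them raw quietly turns the "hybrid" into whichever signal has the bigger range.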
"12+38" won't embed close to "50", as you said they capture only surface level words ("a lot of keyword overlap") not meaning, it's why for small scale I prefer a folder of files and a coding agent using grep/head/tail/Python one liners.