Author here. Thanks everyone for the interest and thoughtful questions! I've noticed many of you are curious about how we achieved the performance numbers and how we compare to other solutions. We're currently working on a detailed blog post that walks through our optimization journey—expect it after the Lunar New Year. We'll also be adding more benchmark comparisons to the repo and blog soon. Stay tuned!
Has anyone compared this with USearch (https://github.com/unum-cloud/USearch)?
I recently discovered https://www.cozodb.org/, which also has vector search built in. I just started some experiments with it, but so far I'm quite impressed. It's not in active development at the moment, but it seems well rounded for what it is, so depending on the use case that may not matter, or may even be an advantage. Also, with today's coding agents it shouldn't be too hard to scratch your own itch if needed.
I haven't been following the vector db space closely for a couple of years now, but I find it strange that they didn't compare their performance to the newest generation of serverless vector DBs: Pinecone Serverless, turbopuffer, or Chroma (the distributed version, not the original single-node implementation). I understand that those are (mostly) hosted products, so there's no true apples-to-apples comparison on the same hardware, but surely the most interesting numbers are cost vs. performance.
How does this compare to DuckDB's vector capabilities (the vss extension)?
I thought these things were memory-bound and CPU wasn't the bottleneck?
Are these sorts of similarity searches useful for classifying text?
Very interesting!
It would be great to see how it compares to Faiss / HNSWLib etc. I'd consider integrating it into txtai as an ANN backend.
Genuine question for anyone running in-process vector search in production: when do you reach for something like this vs. an external service?
The appeal of in-process is obvious — no network hop, simpler deployment, lower latency on small-to-medium datasets. But I'm curious about the operational story. How do you handle index updates while serving queries? Is there a write lock during re-indexing, or can you do hot swaps?
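For concreteness, here's a minimal sketch of what I mean by a hot swap, not tied to any particular library: queries snapshot the current index reference, a rebuild happens off to the side, and the new index is published with a single assignment (atomic in CPython), so readers never block and never see a partial index. The brute-force scan here is just a stand-in for a real ANN build.

    import threading
    import numpy as np

    class HotSwapIndex:
        def __init__(self, vectors):
            self._vectors = np.asarray(vectors, dtype=np.float32)
            self._rebuild_lock = threading.Lock()  # serializes writers only; readers never take it

        def search(self, query, k=10):
            vectors = self._vectors  # snapshot the current reference once
            scores = vectors @ np.asarray(query, dtype=np.float32)
            return np.argsort(-scores)[:k]

        def rebuild(self, vectors):
            # Build off to the side (stand-in for a real ANN index build),
            # then publish with one atomic reference swap. In-flight queries
            # finish against the old array; new queries see the new one.
            with self._rebuild_lock:
                new_vectors = np.asarray(vectors, dtype=np.float32)
                self._vectors = new_vectors

Whether a given library supports this depends on whether its index object is immutable once built; if it mutates in place, you're back to a write lock.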
mceachen's comment about sqlite-vec being brute-force is interesting too. For apps with under ~100K embeddings, does the algorithmic difference even matter in practice, or does the simpler deployment story outweigh raw QPS?
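Rough numbers, for anyone wondering: a brute-force scan over 100K embeddings is one matrix-vector product, which is on the order of milliseconds on a modern laptop. A quick back-of-envelope sketch, assuming 768-dim float32 embeddings (both assumptions, not anything from the article):

    import time
    import numpy as np

    rng = np.random.default_rng(0)
    vectors = rng.standard_normal((100_000, 768), dtype=np.float32)  # assumed corpus
    query = rng.standard_normal(768, dtype=np.float32)

    start = time.perf_counter()
    scores = vectors @ query                     # one pass over every embedding
    top10 = np.argpartition(-scores, 10)[:10]    # top-10 by dot product, unordered
    elapsed = time.perf_counter() - start
    print(f"brute-force top-10 over 100K vectors: {elapsed * 1000:.1f} ms")

At that scale the ANN index mostly buys you QPS headroom, not user-visible latency, so the deployment story probably dominates until you're well past 100K or serving high query volume.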
Their self-reported benchmarks have them outperforming Pinecone by 7x in queries per second: https://zvec.org/en/docs/benchmarks/
I'd love to see those results independently verified, and I'd also love a good explanation of how they're getting such great performance.