
wenc 05/04/2025 | 1 reply

If the data is in Parquet, it's already indexed in a sense: Parquet files carry per-row-group min/max statistics that query engines use to skip data, so no further indexing is necessary.

If the data is stored in DuckDB's native format (which I don't use), DuckDB supports some state-of-the-art indices:

https://duckdb.org/docs/stable/sql/indexes.html

I find Parquet plenty fast though.


Replies

touisteur 05/06/2025

Ah, thanks, of course. I was thinking of dealing with millions of (Geo)JSON files adding up to terabytes, without copying or duplicating them, though: mostly indexing. I used to do that with Postgres foreign data wrappers and had hopes for DuckDB :-). But that's a question for SO or another forum.