logoalt Hacker News

UltraSaneyesterday at 2:17 PM1 replyview on HN

Amazon S3 Select enables SQL queries directly on CSV, JSON, or Apache Parquet objects, allowing retrieval of filtered data subsets to reduce latency and costs


Replies

staticassertionyesterday at 2:25 PM

S3 Select is, very sadly, deprecated. It also supported HTTP RANGE headers! But they've killed it and I'll never forgive them :)

Still, it's nbd. You can cache a billion Parquet header/footers on disk/ memory and get 90% of the performance (or better tbh).

show 1 reply