
aynyc • 05/06/2025

Yes, on Athena we process much larger CSV files, but the cost is far too high. We also have ORC and Parquet files for other datasets, which we process with EMR Spark. I really want to get off those distributed analytics engines whenever possible.

I have to think about partitioning. Spark and Athena both had issues with partitioning by received date: they scan way too much data.
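
To make the partitioning point concrete, here is a minimal sketch of what partitioning by received date is supposed to buy: a query for a date range should only open the matching directories. This is plain Python over a Hive-style `received_date=YYYY-MM-DD/` layout, not how Athena or Spark implement pruning. The function names, the `value` column, and the sample rows are all made up for illustration.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path

FIELDS = ["received_date", "value"]  # hypothetical schema for the sketch


def write_partitioned(rows, root):
    """Append each row to root/received_date=<date>/part.csv (Hive-style layout)."""
    root = Path(root)
    for row in rows:
        part_dir = root / f"received_date={row['received_date']}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part.csv"
        is_new = not path.exists()
        with path.open("a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            if is_new:
                writer.writeheader()
            writer.writerow(row)


def prune_partitions(root, start, end):
    """Return only the partition directories whose date lies in [start, end]."""
    selected = []
    for part_dir in sorted(Path(root).glob("received_date=*")):
        day = date.fromisoformat(part_dir.name.split("=", 1)[1])
        if start <= day <= end:
            selected.append(part_dir)
    return selected


def read_range(root, start, end):
    """Read rows from the pruned partitions only; other files are never opened."""
    rows = []
    for part_dir in prune_partitions(root, start, end):
        with (part_dir / "part.csv").open(newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows


if __name__ == "__main__":
    # Made-up sample data: three received dates, one row each.
    sample = [
        {"received_date": "2025-05-01", "value": "a"},
        {"received_date": "2025-05-02", "value": "b"},
        {"received_date": "2025-05-03", "value": "c"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        write_partitioned(sample, tmp)
        hits = read_range(tmp, date(2025, 5, 2), date(2025, 5, 3))
        print([r["value"] for r in hits])
```

If the filter is on some other column (say an event date instead of the received date), nothing gets pruned and every partition is read, which is one common way to end up scanning way too much data.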
