Is there any strong reason to use GeoParquet instead of plain Parquet if all I'm interested in is storing and operating on lat/lons?
I'm curious whether it compresses them better or something like that. I see lots of people online saying it compresses well, but mostly compared to .shp or similar, and normal Parquet (.gz.parquet or .snappy.parquet) already compresses really well. So it's not clear to me whether I should spend time investigating it...
I mostly process normal Parquet with Spark and sometimes ClickHouse right now.
Based on my reading of the GeoParquet spec, the main difference is that geometries are stored as WKB (well-known binary) in a Parquet byte array column. Byte arrays can be delta-encoded. There is also some additional metadata stored, like the CRS and a bounding box.
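To make the WKB point concrete, here is a minimal sketch of what a single 2D point looks like as WKB, using only Python's standard library. The helper names `point_to_wkb` and `wkb_to_point` are made up for illustration, and the coordinates are sample values. The takeaway is that one WKB point is 21 bytes before compression, versus 16 bytes for two plain float64 values.

```python
import struct

def point_to_wkb(lon, lat):
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # 1 byte: byte order (1 = little-endian)
    # 4 bytes: geometry type (1 = Point)
    # 8 + 8 bytes: x (lon) and y (lat) as float64
    return struct.pack("<BIdd", 1, 1, lon, lat)

def wkb_to_point(buf):
    """Decode a little-endian WKB point back to (lon, lat)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", buf)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("only little-endian 2D points are handled in this sketch")
    return x, y

wkb = point_to_wkb(-122.4194, 37.7749)   # sample coordinates
wkb_size = len(wkb)                      # header + two doubles
plain_size = struct.calcsize("<dd")      # two float64 columns, per row
print(wkb_size, plain_size)
```

How the two layouts compare after Parquet encoding and compression depends on the data and the writer settings, so this only shows the raw per-row sizes.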
With EPSG:4326 lat/lons, I don't think GeoParquet would give you any benefit over just having separate lat and lon columns (this is what I typically do, and it's plenty fast).
If you are using range requests to fetch only parts of Parquet files at a time, you could potentially sort your data along a Hilbert curve. That keeps nearby points in the same row groups, which could limit the number of row groups that need to be fetched to execute a query.
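As a rough sketch of the Hilbert sort idea, the snippet below computes a Hilbert curve index for each lat/lon pair and sorts by it, in plain Python. The function names, the 16-bit grid resolution, and the sample rows are all assumptions for illustration. In a real pipeline you would compute the key as a column in your engine of choice and sort on it before writing.

```python
def hilbert_index(order, x, y):
    """Distance along a Hilbert curve for cell (x, y) on a 2**order square grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def lonlat_to_hilbert(lon, lat, order=16):
    """Quantize lon/lat onto the grid, then take the Hilbert index (assumed scheme)."""
    n = 1 << order
    x = min(n - 1, max(0, int((lon + 180.0) / 360.0 * n)))
    y = min(n - 1, max(0, int((lat + 90.0) / 180.0 * n)))
    return hilbert_index(order, x, y)

# Made-up sample rows with approximate coordinates.
rows = [
    {"name": "a", "lat": 35.68, "lon": 139.69},
    {"name": "b", "lat": 48.85, "lon": 2.35},
    {"name": "c", "lat": 35.01, "lon": 135.77},
    {"name": "d", "lat": 51.51, "lon": -0.13},
]

# Sort so that spatially close rows end up next to each other on disk.
rows.sort(key=lambda r: lonlat_to_hilbert(r["lon"], r["lat"]))
```

After a sort like this, each row group tends to cover a compact area, so its min/max statistics on the lat and lon columns become narrow enough for a reader to skip row groups outside the query's bounding box.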