Parquet files are already built for append only. Just add a new file.
This is a new paradigm for folks who aren’t in big data, where the conventional approach usually involves doing a row INSERT. In big data, appending simply means adding a new file: an engine that reads the files directly expands the file pattern each time a query runs, so the new file is picked up on the next query. This is why “select * from '*.parquet'” always operates on the latest dataset.