logoalt Hacker News

mxmlnknyesterday at 4:07 PM0 repliesview on HN

> Apparently the internal state is only 32kB

Exactly. And often this state is either highly compressible or non-compressible but only sparsely used. The latter can then be made compressible by replacing the unused bytes with zeros.

Ratarmount uses indexed_gzip, and when parallelization makes sense, it also uses rapidgzip. Rapidgzip implements the sparsity analysis to increase compressibility and then simply uses the gztool index format, i.e., compresses each 32 KiB using gzip itself, with unused bytes replaced with zeros where possible.

indexed_gzip, gztool, and rapidgzip all support seeking in gzip streams, but all have some trade-offs, e.g., rapidgzip is parallelized but will have much higher memory usage because of that than indexed_gzip or gztool. It might be possible to compile either of these to WebAssembly if there is demand.