Gzip decompression in 250 lines of Rust

90 points by vismit2000 last Tuesday at 6:35 AM | 33 comments

Comments

stgn today at 3:39 PM

> so i wrote a gzip decompressor from scratch

After skimming through the author's Rust code, it appears to be a fairly straightforward port of puff.c (included in the zlib source): https://github.com/madler/zlib/blob/develop/contrib/puff/puf...

nayuki today at 3:02 PM

Just like that author, many years ago, I went through the process of understanding the DEFLATE compression standard and producing a short and concise decompressor for gzip+DEFLATE. Here are the resources I published as a result of that exploration:

* https://www.nayuki.io/page/deflate-specification-v1-3-html

* https://www.nayuki.io/page/simple-deflate-decompressor

* https://github.com/nayuki/Simple-DEFLATE-decompressor

Lerc today at 4:40 PM

The function

  fn bits(&mut self, need: i32) -> i32 { ....
put me in mind of one of my early experiments in Rust. It would be interesting to compare it with an iterator-based form that just called .take(need).

I haven't written a lot of Rust, but one thing I did was to write an iterator that took an iterator of bytes as input and provided bits as output. I then fed it an iterator that yielded bytes from a block of memory.

It was mostly a test to see how much of an imprint the high-level abstraction left on the compiled code.

The disassembly showed it pulling in 32 bits at a time and shifting out the bits pretty much the same way I would have written it in ASM.

I was quite impressed. Although I tested that it was working by counting the bits, and someone criticized it for not using popcount, so I guess you can't have everything.
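For readers who want to try the comparison, here is a minimal sketch of what such an adapter could look like. It is not the code from the experiment described above or from the article: the `Bits` type, its fields, and the `Option` return value are all assumptions made for illustration. It yields bits least-significant first, which is the order DEFLATE uses, and builds `bits(need)` on top of `.take(need)`.

```rust
/// Hypothetical adapter: turns an iterator of bytes into an iterator of bits,
/// least-significant bit first (the bit order DEFLATE uses).
struct Bits<I: Iterator<Item = u8>> {
    bytes: I,
    current: u8,   // byte currently being consumed
    remaining: u8, // number of unread bits left in `current`
}

impl<I: Iterator<Item = u8>> Bits<I> {
    fn new(bytes: I) -> Self {
        Bits { bytes, current: 0, remaining: 0 }
    }

    /// Reads `need` bits (at most 32) as an integer; the first bit read
    /// becomes the lowest bit. Returns `None` if the input runs out early.
    fn bits(&mut self, need: usize) -> Option<u32> {
        debug_assert!(need <= 32);
        let mut seen = 0;
        let value = self
            .by_ref()
            .take(need)
            .enumerate()
            .fold(0u32, |acc, (i, bit)| {
                seen = i + 1;
                acc | ((bit as u32) << i)
            });
        // Only report a value if all requested bits were available.
        (seen == need).then_some(value)
    }
}

impl<I: Iterator<Item = u8>> Iterator for Bits<I> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        if self.remaining == 0 {
            // Refill from the underlying byte iterator, or stop if it is empty.
            self.current = self.bytes.next()?;
            self.remaining = 8;
        }
        let bit = self.current & 1 == 1;
        self.current >>= 1;
        self.remaining -= 1;
        Some(bit)
    }
}
```

Because `Bits` is an ordinary iterator, the bit-counting check mentioned above would be something like `Bits::new(data.iter().copied()).filter(|b| *b).count()`. Whether the optimizer collapses this into word-sized loads depends on the compiler version and flags, so the disassembly is worth checking rather than assuming.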

MisterTea today at 2:51 PM

> twenty five thousand lines of pure C not counting CMake files. ...

Keep in mind this is also 31 years of cruft and lord knows what.

Plan 9 gzip is 738 lines total:

  gzip.c 217 lines
  gzip.h 40 lines
  zip.c  398 lines
  zip.h  83 lines
Even the zipfs file server that mounts zip files as file systems is 391 lines.

Edit: here is a link to said code: https://github.com/9front/9front/tree/front/sys/src/cmd/gzip

> ... (and whenever working with C always keep in mind that C stands for CVE).

Sigh.

carlos256 today at 5:07 PM

> the only flag we care about is FNAME

The specification does not define an encoding for the file name. Different file systems may impose restrictions on certain names, so FNAME should not be used.
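One way to act on that advice, sketched here under the assumption that a decompressor only needs to find where the DEFLATE stream starts, is to step over the name bytes without ever decoding them. The flag values and field order follow RFC 1952; the function name `skip_gzip_header` and the `HeaderError` type are hypothetical and do not come from the article.

```rust
// Flag bits in the FLG byte of a gzip member header (RFC 1952).
const FHCRC: u8 = 0x02;
const FEXTRA: u8 = 0x04;
const FNAME: u8 = 0x08;
const FCOMMENT: u8 = 0x10;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum HeaderError {
    Truncated,
    BadMagic,
    UnsupportedMethod(u8),
}

/// Returns the offset at which the compressed data begins, skipping the
/// optional fields (including the file name) without interpreting them.
fn skip_gzip_header(data: &[u8]) -> Result<usize, HeaderError> {
    // Fixed part: ID1 ID2 CM FLG MTIME(4) XFL OS = 10 bytes.
    if data.len() < 10 {
        return Err(HeaderError::Truncated);
    }
    if data[0] != 0x1f || data[1] != 0x8b {
        return Err(HeaderError::BadMagic);
    }
    if data[2] != 8 {
        return Err(HeaderError::UnsupportedMethod(data[2]));
    }
    let flags = data[3];
    let mut pos = 10;

    // FEXTRA: two-byte little-endian length followed by that many bytes.
    if flags & FEXTRA != 0 {
        let len = data.get(pos..pos + 2).ok_or(HeaderError::Truncated)?;
        pos += 2 + u16::from_le_bytes([len[0], len[1]]) as usize;
    }
    // FNAME and FCOMMENT: zero-terminated byte strings, in that order.
    for flag in [FNAME, FCOMMENT] {
        if flags & flag != 0 {
            let rest = data.get(pos..).ok_or(HeaderError::Truncated)?;
            let nul = rest.iter().position(|&b| b == 0).ok_or(HeaderError::Truncated)?;
            pos += nul + 1;
        }
    }
    // FHCRC: two-byte CRC16 of the header, not verified in this sketch.
    if flags & FHCRC != 0 {
        pos += 2;
    }
    if pos > data.len() {
        return Err(HeaderError::Truncated);
    }
    Ok(pos)
}
```

Treating the name as opaque bytes sidesteps both the encoding question and any path-handling problems, at the cost of losing the original file name.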

up2isomorphism today at 3:17 PM

Another dev who doesn’t show respect for what has been done and expects a particular language to do wonders for him. Also, I don’t see how this is much better in terms of readability.

jeffrallen today at 3:00 PM

But probably without any error checking.

Feels like Rust culture inherited "throw and forget" as an error-handling "strategy" from Java.

Sigh.
