rahimnathwani, yesterday at 8:37 PM

Right, if you look at PDF files from Internet Archive, they're usually compressed with MRC (Mixed Raster Content).

IIRC each page has three layers:

- background (jpeg, color)

- foreground (jbig2, monochrome maybe?)

- mask (indicating, for each pixel, whether the foreground or the background should be shown)
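
To make the role of the mask concrete, here is a rough sketch of how the three layers could be combined into one page. It is only an illustration of the idea, not how archive-pdf-tools or a PDF viewer actually does it: the function name `composite_mrc` is made up, pixels are plain grayscale integers, and all three layers are assumed to have the same dimensions (in real MRC files the layers may be stored at different resolutions).

```python
def composite_mrc(background, foreground, mask):
    """Combine three same-sized layers into one page.

    Hypothetical helper for illustration only. Each layer is a list of
    rows; a truthy mask value selects the foreground pixel, a falsy one
    selects the background pixel.
    """
    if not (len(background) == len(foreground) == len(mask)):
        raise ValueError("layers must have the same number of rows")

    page = []
    for bg_row, fg_row, mask_row in zip(background, foreground, mask):
        if not (len(bg_row) == len(fg_row) == len(mask_row)):
            raise ValueError("rows must have the same number of pixels")
        page.append(
            [fg if m else bg for bg, fg, m in zip(bg_row, fg_row, mask_row)]
        )
    return page


# Toy 2x3 page: light paper (200) behind dark ink (0).
background = [[200, 200, 200], [200, 200, 200]]
foreground = [[0, 0, 0], [0, 0, 0]]
mask = [[0, 1, 0], [1, 1, 0]]

print(composite_mrc(background, foreground, mask))
# prints [[200, 0, 200], [0, 0, 200]]
```

The point of the split is that each layer can then be compressed with whatever suits it: the sharp-edged mask with a bi-level codec, the smooth layers with a lossy one.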

https://github.com/internetarchive/archive-pdf-tools