Hacker News

reedlaw today at 6:52 PM

This is a fine and useful project, but in my experience the quality of newly printed classics is inferior, for a number of reasons. Besides paper and binding, typesetting is something older editions rarely got wrong, but some new editions are facsimiles made by scanning every page and reprinting the scans. That means that instead of the crisply defined letters of an old printing press, you get fuzzy letters and scan artifacts. This (https://printableclassics.com/harvard_classics) shows what I mean. Not only is the typesetting quality worse, but the new edition also costs much more. I don't have a problem with the price on Printable Classics ($885 for a new 50-volume set is reasonable), but you can often find the same thing cheaper used: a used set is $300-$600 on eBay. The value of these PDFs is that you could make a higher-quality edition, as long as the text is OCR'ed and properly typeset (which is true of the Moby Dick version on the site). For the scanned copies, re-typesetting would be a big undertaking, but I'm sure LLMs could help.



rahimnathwani today at 8:08 PM

I wonder how good a job the ClearScan feature in Adobe Acrobat would do. IIRC it creates one or more fonts based on the existing characters in the PDF. So each lowercase 'a' would look the same, and be a sort of average of all the 'a' letters in the book.
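
To make the "average of all the 'a' letters" idea concrete, here is a toy sketch. It is not Adobe's actual algorithm (I don't know how ClearScan works internally); it just assumes every scanned instance of a letter has already been cropped and aligned to the same small black-and-white bitmap, then takes a per-pixel majority vote. The names `Glyph`, `average_glyph`, and `threshold` are made up for illustration.

```python
from typing import List

# A glyph is a grid of pixels: 1 = ink, 0 = paper (hypothetical representation).
Glyph = List[List[int]]


def average_glyph(samples: List[Glyph], threshold: float = 0.5) -> Glyph:
    """Combine aligned scans of one letter into a single cleaned-up glyph.

    Each output pixel is ink if at least `threshold` of the samples have ink
    at that position, so isolated specks and dropouts get voted out.
    """
    if not samples:
        raise ValueError("need at least one sample")
    height, width = len(samples[0]), len(samples[0][0])
    for glyph in samples:
        if len(glyph) != height or any(len(row) != width for row in glyph):
            raise ValueError("all samples must share the same dimensions")

    count = len(samples)
    return [
        [
            1 if sum(glyph[y][x] for glyph in samples) / count >= threshold else 0
            for x in range(width)
        ]
        for y in range(height)
    ]


if __name__ == "__main__":
    clean = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    speck = [[0, 1, 0], [1, 1, 1], [1, 1, 0]]    # stray ink in the bottom-left
    dropout = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # missing ink in the centre
    for row in average_glyph([clean, speck, dropout]):
        print("".join("#" if pixel else "." for pixel in row))
```

A real tool would also have to segment the page into letters, cluster similar shapes, and align them before anything like this could work, which is where most of the difficulty lies.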

bookman10 today at 7:17 PM

I agree. I wish I had the time to re-typeset those. I would be concerned about mildew/mold on old, used copies, though.
