Hacker News

rmbyrro, today at 6:07 PM

I agree with you for some kinds of images, but not all.

LLMs are the best PDF-to-markdown converters, in my experience. I have a CLI that converts the PDF to PNGs, then I run a background agent to "read" each PNG and write it out as markdown. It works flawlessly even for complex math formulas, and it can "translate" complex charts, graphs, and tables into words.
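For anyone who wants to try the idea, here is a rough sketch of that kind of pipeline, not my actual CLI. It assumes Poppler's `pdftoppm` is installed for the PDF-to-PNG step, and it leaves the model call as a hypothetical `transcribe` callable that you would wire to whatever vision-capable LLM you use. The function names, the `page` file prefix, and the 150 DPI default are all made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def page_number(png: Path) -> int:
    """Extract the page number from a name like 'page-03.png'."""
    return int(png.stem.rsplit("-", 1)[1])


def render_pages(pdf: Path, out_dir: Path, dpi: int = 150) -> list[Path]:
    """Render each PDF page to a PNG using Poppler's pdftoppm (assumed installed)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(out_dir / "page")],
        check=True,
    )
    # Sort numerically so page 10 never lands before page 2.
    return sorted(out_dir.glob("page-*.png"), key=page_number)


def pages_to_markdown(
    pngs: Iterable[Path],
    transcribe: Callable[[bytes], str],
) -> str:
    """Send each page image to `transcribe` and join the results as one document.

    `transcribe` is a placeholder: it takes PNG bytes and returns markdown.
    In practice it would call an LLM with a prompt along the lines of
    "transcribe this page as markdown, describe charts in words".
    """
    parts = [transcribe(png.read_bytes()).strip() for png in pngs]
    return "\n\n".join(parts) + "\n"
```

Usage would look something like `pages_to_markdown(render_pages(Path("paper.pdf"), Path("pages")), my_llm_call)`, where `my_llm_call` is your own function. Going one page at a time keeps each request small and makes it easy to retry a single page that comes back wrong.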

It's slow and arguably expensive compared to traditional OCR, but very effective and precise.