
HPsquared • yesterday at 4:24 PM • 1 reply

You could maybe then do a second pass on the whole text (as plain text, not OCR) to look for likely mistakes.

Replies

This is not always easy. The models I tried were too helpful and rewrote too much instead of just fixing simple typos. Even with huge prompts, I still found sentences where the LLM had been too enthusiastic. I ended up applying regexes for common typos and accepting some residual errors. It might be better now, though. Since then I’ve moved to all-in-one solutions like Mathpix and Mistral-OCR, which are quite good for my purpose.
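For illustration, a minimal sketch of what a regex cleanup pass like that could look like. The patterns and the `fix_common_typos` name are made up for this example, not the ones actually used; a real table would be built from the errors seen in your own OCR output.

```python
import re

# Hypothetical substitution table for common OCR slips.
# Each entry is (compiled pattern, replacement); order matters.
OCR_FIXES = [
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),     # rejoin words hyphenated across a line break
    (re.compile(r"(?<=[a-z])1(?=[a-z])"), "l"),  # digit 1 inside a lowercase word -> l
    (re.compile(r"(?<=[a-z])0(?=[a-z])"), "o"),  # digit 0 inside a lowercase word -> o
    (re.compile(r"\btbe\b"), "the"),           # frequent h/b confusion
    (re.compile(r" {2,}"), " "),               # collapse runs of spaces
]


def fix_common_typos(text: str) -> str:
    """Apply each substitution in turn; anything not matched is left untouched."""
    for pattern, replacement in OCR_FIXES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(fix_common_typos("tbe mode1s were he1pful and rewr0te too  much"))
```

Unlike an LLM pass, this can only change what the patterns match, so it never rewrites a sentence. The trade-off is the one mentioned above: errors outside the table stay in.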
