Hacker News

Fnoord, today at 3:39 PM

There are very good OCR models. Then it becomes a matter of which letter is which. In Latin script there are only 26 possibilities, and then there are numbers and symbols.

1) https://mistral.ai/news/mistral-ocr-3


Replies

gunalx, today at 5:30 PM

It's not as simple as just OCR and map, though. Some letters need space above them; some need to be placed lower.

Take g, f and c, for example.

g and f are about the same height but sit at different vertical offsets, and c would look like a capital C if scaled to the same size as g and f. (We probably want to auto-adjust scales to match more evenly, unless the text is on a grid, in which case removing the grid is the difficulty.)

These are just the difficulties I ran into while trying to build a more automated input pipeline for FontForge.
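The vertical-placement problem with g, f and c can be sketched roughly as follows. This is only an illustration, not FontForge's API: `GlyphBox`, `estimate_baseline` and `font_metrics` are hypothetical names, the pixel values are made up, and the baseline guess assumes most glyphs on a scanned line have no descender.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class GlyphBox:
    """Bounding box of one scanned glyph, in image pixels (y grows downward)."""
    char: str
    top: int
    bottom: int


def estimate_baseline(boxes):
    """Guess the baseline as the median bottom edge of the glyphs on a line.

    Assumption: most glyphs sit on the baseline, so descenders are outliers.
    """
    return median(b.bottom for b in boxes)


def font_metrics(box, baseline, scale):
    """Return (y_min, y_max) in font units, y pointing up, baseline at 0.

    One shared scale is used for every glyph, so relative sizes are kept.
    """
    y_max = (baseline - box.top) * scale
    y_min = (baseline - box.bottom) * scale
    return y_min, y_max


# Made-up boxes: g and f are both 70 px tall, c is only 40 px tall.
line = [GlyphBox("g", 60, 130), GlyphBox("f", 30, 100), GlyphBox("c", 60, 100)]
baseline = estimate_baseline(line)  # 100
for box in line:
    print(box.char, font_metrics(box, baseline, scale=10))
# g (-300, 400)  descends below the baseline
# f (0, 700)     same height as g, but sits higher
# c (0, 400)     stays at x-height instead of growing into a capital C
```

Using a single scale per line, rather than normalizing each glyph to the same height, is what keeps c from being stretched to the size of f. Text written on a grid would need a different baseline estimate.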
