bazzmt • today at 6:13 AM • 1 reply

Interesting approach! One question though: can the model do column detection?

The first OCR example returns output that does not detect the article's columns: the bounding box spans the entire first line.

Replies

yoeven • today at 7:10 AM

It can. Try prompting the model to use object detection and text extraction. We realized that when we purely extract text, it does amazingly well at word- and sentence-level bounds, since the text acts as the anchor. However, when you treat it as an object detection problem, it sees that chunk of text as a single segment, allowing you to extract it as one column bound. Give that a try.
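To make the difference between word/sentence-level bounds and a single column bound concrete, here is a minimal sketch of merging small text boxes into column boxes as a post-processing step. It is not the model's method or the prompting approach described above, just an illustration. The `(x0, y0, x1, y1)` box layout, the function names, and the `min_gap` threshold are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x0, y0, x1, y1) in pixels.
Box = Tuple[float, float, float, float]


def union_box(boxes: List[Box]) -> Box:
    """Return the smallest box that encloses every box in the list."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def column_bounds(boxes: List[Box], min_gap: float = 20.0) -> List[Box]:
    """Group boxes into columns by horizontal position.

    Boxes are scanned left to right. A box joins the current column if it
    starts within `min_gap` of the column's right edge; otherwise the gap
    is treated as a gutter and a new column begins.
    """
    columns: List[List[Box]] = []
    current_right = 0.0
    for box in sorted(boxes, key=lambda b: b[0]):
        if columns and box[0] - current_right <= min_gap:
            columns[-1].append(box)
            current_right = max(current_right, box[2])
        else:
            columns.append([box])
            current_right = box[2]
    return [union_box(col) for col in columns]


# Made-up sentence-level boxes for a two-column page.
example_boxes: List[Box] = [
    (0, 0, 90, 10),
    (100, 0, 200, 10),
    (0, 12, 200, 22),
    (260, 0, 450, 10),
    (260, 12, 440, 22),
]
columns = column_bounds(example_boxes, min_gap=20.0)
```

This only works when the gutter between columns is wider than `min_gap`, so the threshold would need tuning per layout.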
