Has anyone experimented with using VLMs to detect "marks"? Thinking of pen/pencil based ...

sinandrei • yesterday at 5:16 PM • 1 reply

Has anyone experimented with using VLMs to detect "marks"? I'm thinking of pen- or pencil-based markings like underlines, circles, and checkmarks. Can these models do it?

Replies

leetharris • yesterday at 5:31 PM

In our experience, none of them do it well. We had to write our own custom pipeline, built on a mixture of legacy CV approaches, to handle this (for AI contract analysis). We benchmark every new multimodal model and VLM that comes out and are consistently disappointed.
