Hacker News

faangguyindia today at 12:23 AM

>Is it possible to ask the vision agent to "map"

No, most vision models focus on a subset of an image at a time when doing image -> text.

Image -> image uses the whole image.


Replies

esperent today at 10:34 AM

> No, most vision models focus on a subset of an image at a time when doing image -> text.

Is this true? Where can I read more about it?