Hacker News

mootothemax, today at 5:53 PM

Can any LLM give you the rough pixel coordinates of an item it identifies in an image?

I found that while Claude, GPT, etc. could describe an image, there was no way to link the description back to specific pixels in the image itself, not even to a bounding box or segment.