Hacker News

infecto, yesterday at 8:49 PM

So you are suggesting building a full-featured package, which is nontrivial compared to this fun excitement?

Vision models do a pretty decent job with spatial reasoning. They’re not there yet, but you’re dismissing some interesting work going on.