Hacker News

efavdb today at 1:16 AM

The article says this misses important details, e.g. data that might be in the image.


Replies

breadislove today at 1:26 AM

very bad take. with most modern multimodal models you get way better performance than going to text first
