Hacker News

breadislove today at 1:26 AM

very bad take. with most modern multimodal models you get way better performance than going to text first


Replies

emil_sorensen today at 8:26 AM

it's a cost/latency trade-off in production + very use-case dependent