Hacker News

redox99, yesterday at 6:01 PM

Vision is still much weaker than text for LLMs. So you could argue that we already have AGI for text but not for vision inputs, or you could argue that AGI requires being human-level at text, vision, and sound.