Hacker News

simianwords today at 8:30 AM (5 replies)

https://artificialanalysis.ai/models/grok-4-3


Replies

nextaccountic today at 9:09 AM

This puts Sonnet 4.6 above Opus 4.6 in the coding index, so it's kinda hard to trust those numbers.

(Also, it puts Opus 4.7 universally above Opus 4.6, and I may be wrong, but this doesn't seem to match the experience of most/many/some people. I think it's widely recognized that Anthropic is severely lacking compute and that Opus 4.7 is a cost-saving measure.)

Alifatisk today at 9:17 AM

Those numbers don't look exciting at all. I may have gotten spoiled by releases from Qwen, Kimi and Z.ai, who keep closing the gap between closed-weight SOTA models and open-weight ones. From my experience, Grok is only useful for one thing, and that's looking things up for you and gathering a consensus on topics. That's it.

Update: I noticed that Grok 4.3 is in the "Most attractive quadrant", that's cool! It is also in the top 5 on the "AA-Omniscience Index", good! Really good.

progbits today at 9:18 AM

What's with the charts and numbers?

It says #1 for speed, but in the chart it's #2. It also says #10 for intelligence, but in the chart it's #7.

BoorishBears today at 9:08 AM

What an exciting game we're playing, where the most popular leaderboard is completely made up and the stakes are in the trillions.