https:&#x... | alt Hacker News

simianwords • today at 8:30 AM • 5 replies

https://artificialanalysis.ai/models/grok-4-3

Replies

This puts Sonnet 4.6 above Opus 4.6 in the coding index, so it's kinda hard to trust those numbers.

(Also, it puts Opus 4.7 universally above Opus 4.6, and I may be wrong, but this doesn't seem to match the experience of most/many/some people. I think it's widely recognized that Anthropic is severely lacking compute and Opus 4.7 is a cost-saving measure.)

Alifatisk • today at 9:17 AM

Those numbers don't look exciting at all. I may have gotten spoiled by releases from Qwen, Kimi and Z.ai, who keep closing the gap between closed-weight SOTA models and open-weight ones. From my experience, Grok is only useful for one thing, and that's looking things up for you and gathering a consensus on topics. That's it.

Update: I noticed that Grok 4.3 is in the "Most attractive quadrant", that's cool! It is also in the top 5 on the "AA-Omniscience Index", good! Really good.

progbits • today at 9:18 AM

What's with the charts and numbers?

It says #1 for speed, but then in the chart it's #2. It also says #10 for intelligence, but then it's #7 in the chart.

BoorishBears • today at 9:08 AM

What an exciting game we're playing, where the most popular leaderboard is completely made up and the stakes are in the trillions.