First model to get 100% on my agentic benchmark:

nl • yesterday at 11:51 PM • 1 reply

First model to get 100% on my agentic benchmark: https://sql-benchmark.nicklothian.com/?highlight=anthropic_c...

Replies

grok-4.1-fast is the number 2 model on this benchmark.

~~If you've used this model in real life to do any sort of programming, and have seen its output, you would know that there is something VERY wrong with your benchmark.~~

Edit: Oh sorry, I looked at the questions, I see this is also for SQL specifically. Interesting. Maybe they tuned that grok model for SQL. Cool site. I bookmarked it.
