
ac29, yesterday at 8:14 PM

Your benchmark has Opus 4.7 performing significantly worse than Sonnet 4.6. Even if true on your benchmark, that is not representative of the overall performance of the models.


Replies

guilamu, yesterday at 8:22 PM

Yes, Opus 4.7 fast (no reasoning) did a worse job than Sonnet 4.6 high (with reasoning), according to the Gemini 3.1 Pro evaluation.
