
acchow, yesterday at 8:48 PM

Agreed. The graphs clearly show that Opus 4.8 performs strictly better at the same cost per task.


Replies

jsnell, yesterday at 9:07 PM

But they don't show "strictly better" performance at cost per task!

The graphs show parts of the cost/performance Pareto frontier occupied by Opus 4.8 and other parts occupied by Sonnet 5.0. If Opus 4.8 were strictly better at the same cost per task, as you say, then by definition the entire frontier would be occupied by Opus.

So neither is Pareto-dominant over the other. In contrast, Sonnet 5.0 is Pareto-dominant over Sonnet 4.6 on those graphs.
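
To make the definition concrete, here is a rough sketch of the dominance check. The model names "A", "B", "C" and all the numbers are made up for illustration and are not read off the graphs; I'm also assuming lower cost per task and higher score are both better.

```python
from typing import NamedTuple


class Point(NamedTuple):
    model: str
    cost: float   # cost per task, lower is better (assumed)
    score: float  # benchmark score, higher is better (assumed)


def dominates(a: Point, b: Point) -> bool:
    """a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.score >= b.score
    strictly_better = a.cost < b.cost or a.score > b.score
    return no_worse and strictly_better


def pareto_frontier(points: list[Point]) -> list[Point]:
    """Points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def model_dominates(a_pts: list[Point], b_pts: list[Point]) -> bool:
    """Every point of model b is dominated by some point of model a."""
    return all(any(dominates(a, b) for a in a_pts) for b in b_pts)


# Hypothetical data, NOT taken from the graphs under discussion.
a = [Point("A", 1.0, 70), Point("A", 3.0, 90)]   # expensive, high scores
b = [Point("B", 0.2, 50), Point("B", 0.5, 65)]   # cheap, mid scores
c = [Point("C", 0.3, 40), Point("C", 0.8, 55)]   # older, worse than B

frontier = pareto_frontier(a + b + c)
frontier_models = {p.model for p in frontier}

print(sorted(frontier_models))   # models that hold part of the frontier
print(model_dominates(a, b))     # does A dominate B?
print(model_dominates(b, a))     # does B dominate A?
print(model_dominates(b, c))     # does B dominate C?
```

With these made-up numbers, A and B each hold part of the frontier, so neither dominates the other, while B dominates C. That is the same shape as the argument above: a model that was strictly better at every cost would leave no frontier points for anyone else.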
