Hacker News

conception yesterday at 6:01 PM

Probably explains why Opus was trash for the last week: https://marginlab.ai/trackers/claude-code/. Curious whether the new baseline will now rise in line with the new benchmarks.


Replies

hedora yesterday at 6:05 PM

Nice. Can you release that for older models too? I've been using a mixture of releases recently, and cannot tell the difference between any of them.

geoffbp yesterday at 9:50 PM

This is cool. Thanks for sharing!