Why does Anthropic change the set of benchmarks they use with every new model release?

irthomasthomas • yesterday at 5:44 PM • 1 reply

pietz • yesterday at 5:55 PM

1. Benchmarks saturate.
2. They select the benchmarks that show the most impressive improvements.
