Hacker News

ehtbanton, yesterday at 11:52 PM

I will always maintain that the best benchmark is just trying it out for yourself. The most practical parallel for me is all the people posting about how some open-source model has "achieved X on Y benchmark - beating out Opus 4.6!" It's all show and everyone cheats.