Hacker News

PunchyHamster, yesterday at 12:20 PM

I'm sure that with benchmarks like these, future LLMs will be optimized to hide regressions by "fixing" the test framework too.


Replies

pixl97, yesterday at 3:24 PM

Isn't misalignment great.