alansaber, today at 11:26 AM

I appreciate your reply, but you are completely glossing over his point that head-to-head model evals are useless lmao