Hacker News

teaearlgraycold, yesterday at 8:10 AM

It’s tough to write good questions for LLM evaluations. The models are so good at picking up subtle cues that they can pass a multiple-choice test when given only the answer options and not the questions.
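
One way to check a benchmark for this is an answers-only baseline: hide the question, show only the options, and compare accuracy against random guessing. Below is a rough sketch of that idea, not a real harness. The item format, the function names, and the toy data are all made up for illustration, and `longest_choice` is a crude stand-in heuristic where an actual LLM call would go.

```python
from typing import Callable, Sequence

# Hypothetical item format (an assumption for this sketch): the question text
# is deliberately absent, so the predictor can only see the answer options.
TOY_ITEMS = [
    {"choices": ["4", "5", "It depends on the base being used", "6"], "answer": 2},
    {"choices": ["Red", "Blue", "Green", "A mixture of all visible wavelengths"], "answer": 3},
    {"choices": ["Always", "Never", "Only when the buffer is flushed first", "Sometimes"], "answer": 2},
    {"choices": ["Mercury", "Venus", "Mars", "Jupiter"], "answer": 2},
]


def choices_only_accuracy(
    items: Sequence[dict], predict: Callable[[Sequence[str]], int]
) -> float:
    """Fraction of items answered correctly when only the options are shown."""
    correct = sum(1 for item in items if predict(item["choices"]) == item["answer"])
    return correct / len(items)


def chance_accuracy(items: Sequence[dict]) -> float:
    """Expected accuracy of uniform random guessing over the same items."""
    return sum(1 / len(item["choices"]) for item in items) / len(items)


def longest_choice(choices: Sequence[str]) -> int:
    """Stand-in predictor: pick the longest option (first one wins on ties)."""
    return max(range(len(choices)), key=lambda i: len(choices[i]))


if __name__ == "__main__":
    acc = choices_only_accuracy(TOY_ITEMS, longest_choice)
    base = chance_accuracy(TOY_ITEMS)
    print(f"answers-only accuracy: {acc:.2f}, chance: {base:.2f}")
```

If the answers-only score sits well above chance, the options themselves are leaking the answer (length, hedged wording, an odd one out), and the questions probably need rewriting before the benchmark says much about the model.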