Hacker News

kingstnap, today at 9:07 PM

In my experience, LLMs often have really solid insights in their thinking chains, then vomit out a score that doesn't make sense.

Now, I'm not sure this is actually an LLM-only thing, because I think people probably do something similar when you ask them to put a number on things without providing a concrete grading rubric...