Hacker News

purple-leafy, today at 3:34 AM

Benchmarks are great, but I feel like there’s a better way; this seems quite subjective.

What you really need is an objective benchmark.


Replies

eli, today at 3:53 AM

I actually really like subjective benchmarks, so long as it's a human (ideally me) grading the results. LLM as judge never made much sense.

echelon, today at 3:46 AM

> What you really need is an objective benchmark

"When are all the software engineers unemployed?"
