eli • today at 3:53 AM • 1 reply

I actually really like subjective benchmarks, so long as it's a human (ideally me) grading the results. LLM as judge never made much sense.

charcircuit • today at 4:50 AM

The issue is that you can't do unsupervised learning if you require humans.
