ToucanLoucan • today at 1:53 PM

It's not a methodology problem, it's a testability problem. LLMs are not deterministic. You can ask the same LLM the same question five times and you'll likely get at least three different answers.

Again. Slot machine.

Replies

Ukv • today at 2:03 PM

You can meaningfully test whether one slot machine hits the jackpot more often than another; the methodology just has to involve a large number of repeats rather than a few anecdotes. Some LLM leaderboard sites do this with blind comparisons.
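
To make the "large number of repeats" point concrete, here is a minimal sketch of comparing two non-deterministic systems by success rate. Everything in it is an assumption for illustration: the `pull` helper, the win probabilities 0.6 and 0.5, and the choice of a two-proportion z-test are mine, not how any particular leaderboard works. For an LLM, a "win" could be a blind grader marking the answer correct.

```python
import math
import random

def success_rate_z_test(wins_a, n_a, wins_b, n_b):
    """Two-proportion z-test. Returns (z, two-sided p-value)."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, nothing to distinguish
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def pull(win_prob, n, rng):
    """Hypothetical stand-in for n independent trials of a random system."""
    return sum(rng.random() < win_prob for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 2000):
    wins_a = pull(0.6, n, rng)  # assumed true win rate of machine A
    wins_b = pull(0.5, n, rng)  # assumed true win rate of machine B
    z, p = success_rate_z_test(wins_a, n, wins_b, n)
    print(f"n={n}: A={wins_a}/{n}, B={wins_b}/{n}, z={z:.2f}, p={p:.4f}")
```

With five pulls each the result is mostly noise; with a couple of thousand, a real gap of this size should usually show up clearly. The randomness doesn't prevent testing, it just sets how many samples you need.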
