I haven’t seen it discussed anywhere that closed models can essentially cheat benchmarks right? What...

cedws • today at 12:10 AM • 2 replies

I haven’t seen it discussed anywhere, but closed models can essentially cheat on benchmarks, right? What Anthropic or OpenAI brands as a model doesn’t have to be just weights; it can be a whole backend system that augments the model itself. That lets them score better on benchmarks than an open-source model that is weights alone.

Replies

jstanley • today at 3:16 AM

Sure, I think that's fine; that all counts. It counts for open source too: it's not like they're somehow running these benchmarks without any harness.

Nobody cares if your AGI is 100% made out of neural networks or if it's like 50% neural networks and 50% Perl scripts.


snthpy • today at 2:56 AM

Good point
