Hacker News

100ms today at 6:01 PM

Tiny model overfit on benchmark published 3 years prior to its training. News at 10.


Replies

selimthegrim today at 6:46 PM

It wasn't important enough to make the 11 o'clock program.

bigyabai today at 6:02 PM

But GPT-3.5 was benchmaxxing too.

srslyTrying2hlp today at 6:28 PM

[dead]