Hacker News

RDTvlokip today at 2:20 PM

I have a question, as it happens: do you think the models were trained on the benchmark datasets to skew the results, even though in real-world applications we find they're not that great?


Replies

sinuhe69 today at 5:09 PM

The recent incident with the Rio 3.5 model clearly shows that many coding models are specifically trained or fine-tuned for the benchmarks.