Ok, that's... just cheating. You can't take a benchmark like MMLU, designed to test the performance of a single general language model, and compare it to the performance of a small specialized model designed to do well on MMLU.
It wasn't designed to do well on MMLU; it's a general model designed for deterministic tasks like OCR, object detection, STT, and more, and a byproduct of that is strong language ability. It still has a transformer backbone, which gives it solid language skills while being good at other things.
See the full benchmark: https://interfaze.ai/leaderboards