logoalt Hacker News

ainchtoday at 12:42 AM0 repliesview on HN

It's a good paper by Hooker but that specific comparison is shoddy. Llama and Aya were both trained by significantly more competent labs on different datasets to Falcon and BLOOM. The takeaway there is "it doesn't matter if you have loads of parameters if you don't know what you're doing."

If we compare apples-to-apples, eg. across Claude models, the larger Opus still happily outperforms it's smaller counterparts.