Hacker News

kristopolous today at 1:00 AM

Okay, I think there's a familiarity delta. I constantly run into this.

I know Artificial Analysis quite well as the gold standard in LLM evals.

But I guess they're still obscure.

I didn't think they were.

The age is important because new techniques keep being developed and so it is a very rough indicator of the size/cost/efficiency trade-off.

A model's age is a major indicator of what you can expect from it.

I really need to develop a better sense for what people know. That's only one of my problems.

Thanks for engaging with me.


Replies

mrbungie today at 9:24 AM

> I know Artificial Analysis quite well as the gold standard in LLM evals.

I also know them, but it took me a while to realise you were publishing their data in that table. I don't think it was clear.

> The age is important because new techniques keep being developed and so it is a very rough indicator of the size/cost/efficiency trade-off.

Yes, but you are already including the name of the model. Your potential audience for the table already knows each model's release history, and therefore its age, at least roughly.