Hacker News

comboy, yesterday at 8:23 AM

Not really related, but does anybody know if somebody's tracking the same models' performance on some benchmarks over time? Sometimes I feel like I'm being A/B tested.


Replies

XCSme, yesterday at 8:27 AM

Oh, I didn't think about this, that's a good idea. I also feel that model performance generally changes over time (usually it gets worse).

The problem with doing this is cost. Constantly testing a lot of models on a large dataset can get really costly.
