Not really related, but does anybody know if somebody's tracking the same model's performance on some benchmarks over time? Sometimes I feel like I'm being A/B tested.
Oh, I didn't think about this, that's a good idea. I also feel that model performance generally changes over time (usually it gets worse).
The problem with doing this is cost. Constantly testing a lot of models on a large dataset can get really costly.