Hacker News

hatthew · yesterday at 11:52 PM

I feel like it's a little disingenuous to compare against full-precision models. Anyone concerned about model size and memory usage is surely already using at least 8-bit quantization.
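For context, here's a minimal sketch of how cheap an 8-bit baseline is to set up with Hugging Face transformers plus bitsandbytes (the model id is just a stand-in; any causal LM on the Hub works the same way):

    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Stand-in model id for illustration.
    model_id = "meta-llama/Llama-2-7b-hf"

    # Load weights in 8-bit via bitsandbytes -- roughly halves
    # memory relative to fp16, with minimal quality loss.
    bnb_config = BitsAndBytesConfig(load_in_8bit=True)

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )

That's the baseline any memory-constrained user already has, a few lines of config, so beating fp16 on memory isn't saying much.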

Their main contribution seems to be hyperparameter tuning, and they don't compare against other quantization techniques of any sort.