KeplerBoy • today at 9:04 AM

Also it's not just about running an obviously worse quant.

Running different GPU kernels / inference engines also matters. It's easy to write an implementation that is faster and thus cheaper but numerically much noisier / less accurate.
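To make that concrete, here is a toy sketch in plain Python, not a real GPU kernel. It assumes the "cheap" implementation keeps its running sum in half precision while the careful one accumulates in a wider type. The function names are made up for illustration, and half precision is simulated with the `struct` module's `"e"` format.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot_fp16_accumulate(a, b):
    """Hypothetical 'fast' dot product: products and running sum kept in fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_half(acc + to_half(x * y))
    return acc


def dot_wide_accumulate(a, b):
    """Hypothetical 'careful' dot product: fp16 inputs, wide (float64) accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return acc


n = 4096
ones = [1.0] * n  # every input is exactly representable in fp16

noisy = dot_fp16_accumulate(ones, ones)
reference = dot_wide_accumulate(ones, ones)

print(noisy)      # 2048.0
print(reference)  # 4096.0
```

Same inputs, same math on paper, but the fp16 accumulator stalls at 2048: at that magnitude adjacent half-precision values are 2 apart, so adding 1 rounds back to 2048 every time. Real kernels fail in subtler ways (reduction order, fused ops, approximate math functions), but the mechanism is the same.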
