It's a bit misleading to say they take 14x less memory; no one is doing inference with 16-bit models.
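
To put rough numbers on that, here is a quick sketch of how the 14x figure shrinks against quantized baselines. It assumes the 14x claim is measured against 16-bit weights and that memory scales linearly with bits per weight (ignoring KV cache, activations, and quantization metadata); `ratio_vs` and the constants are just illustrative names.

```python
# Assumption: the "14x less memory" claim is relative to a 16-bit baseline,
# and weight memory scales linearly with bits per weight.
CLAIMED_RATIO = 14.0
CLAIMED_BASELINE_BITS = 16


def ratio_vs(baseline_bits: float) -> float:
    """Rescale the claimed savings to a different baseline precision."""
    return CLAIMED_RATIO * baseline_bits / CLAIMED_BASELINE_BITS


for bits in (16, 8, 4):
    print(f"vs {bits}-bit baseline: {ratio_vs(bits):.1f}x less memory")
```

So under those assumptions the savings would be closer to 7x against an 8-bit baseline and 3.5x against a 4-bit one.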

WhitneyLand • today at 4:07 PM
