TurboQuant is widely known in the industry not to be state of the art. There are better KV cache quantization schemes at every bitrate, e.g. SpectralQuant (https://github.com/Dynamis-Labs/spectralquant), among many, many papers.
> Given that TurboQuant results in a 6x reduction in memory usage for KV caches
It all depends on the baseline. The "6x" is measured against a BF16 KV cache, not against a state-of-the-art 8-bit or 4-bit KV cache scheme.
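
To make the baseline point concrete, here is a back-of-the-envelope sketch. It assumes the 6x figure is a pure bits-per-value ratio against 16-bit storage and ignores per-block overhead such as scales and zero points; the function name is just for illustration.

```python
def reduction_vs_baseline(claimed_reduction, claimed_baseline_bits, new_baseline_bits):
    """Re-express a claimed memory reduction against a different baseline.

    Assumption: the claim is a plain bits-per-value ratio, with no
    metadata overhead (scales, zero points) on either side.
    """
    # Effective bits per cached value implied by the claim.
    effective_bits = claimed_baseline_bits / claimed_reduction
    return new_baseline_bits / effective_bits


# The same "6x vs BF16" claim, restated against 16-, 8-, and 4-bit caches.
for bits in (16, 8, 4):
    print(f"vs {bits}-bit baseline: {reduction_vs_baseline(6.0, 16, bits):.2f}x")
```

Under those assumptions, 6x over BF16 implies about 2.67 bits per value, which is roughly 3x over an 8-bit cache and 1.5x over a 4-bit one.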