Also, there is zero reason to think the big labs haven't had something similar to TurboQuant for a long time already.
The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.
TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.
The open source tooling got quantization support 3 years ago! It was a cruder form of quantization, but more than enough to show that the savings just get spent on bigger models.
TurboQuant's specific benefit is compressing the KV cache at a negligible cost to quality. That mainly means context lengths can go up for the same amount of memory. However, the KV cache only accounts for something like 20% of the overall memory footprint, so this will not dramatically reduce memory demands in the way that some of the more sensationalist reporting has stated.
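To put rough numbers on that, here is a back-of-the-envelope sketch. The layer, head, and dimension figures are illustrative assumptions for a 7B-class transformer, not numbers from the TurboQuant post, and 4-bit is just a plausible quantized width:

```python
# Back-of-the-envelope KV cache arithmetic. Model shape below is an
# illustrative assumption (roughly 7B-class), not from the TurboQuant post.
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bits_per_value=16):
    # K and V each store n_kv_heads * head_dim values per layer per token.
    values_per_token = 2 * n_layers * n_kv_heads * head_dim
    return seq_len * values_per_token * bits_per_value // 8

fp16 = kv_cache_bytes(8192)                    # baseline 16-bit cache
int4 = kv_cache_bytes(8192, bits_per_value=4)  # hypothetical 4-bit cache
print(f"fp16 KV cache at 8k tokens:  {fp16 / 2**30:.2f} GiB")  # 4.00 GiB
print(f"4-bit KV cache at 8k tokens: {int4 / 2**30:.2f} GiB")  # 1.00 GiB

# Same memory budget buys 4x the context once the cache is 4-bit. But if
# the cache is only ~20% of total memory use, the end-to-end saving is
# roughly 0.2 * (1 - 1/4) = 15% — not a dramatic drop.
```

In other words, a 4x compression of a 20% slice frees about 15% of total memory: nice to have, nothing like the headline numbers.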