I tried TQ for vector search and my findings are not good: it is not worth it if you cannot use a GPU. However, I got the same search quality as 32f using 8-bit quantization.
I wrote an ANN extension for SQLite using TQ. I do save a lot of space, but 32f is still faster despite everything I have tried.
You're right that 32f is faster on raw query time; quantization adds an extra step. The main benefit is download size, since gzip won't help much, which matters most in browser contexts.
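To make the size vs. extra-step trade-off concrete, here is a rough sketch. Note that it uses plain per-vector 8-bit scalar quantization, not TQ itself, and the data is random Gaussian noise standing in for real embeddings, so the numbers are only illustrative. The names `quantize`, `dequantize`, `DIM` and `N` are made up for this example.

```python
import gzip
import random
import struct

random.seed(0)
DIM, N = 64, 200  # hypothetical dimensions, chosen only for the demo

# Stand-in data: random Gaussian vectors, not real embeddings.
vectors = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N)]

def quantize(vec):
    """Map floats to uint8 codes using a per-vector offset and scale."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vec)
    return lo, scale, codes

def dequantize(lo, scale, codes):
    """The extra step paid at query time: codes back to approximate floats."""
    return [lo + c * scale for c in codes]

# Serialized float32 payload: 4 bytes per component.
f32_blob = b"".join(struct.pack(f"<{DIM}f", *v) for v in vectors)

# Serialized 8-bit payload: two float32 parameters plus 1 byte per component.
q8_blob = b"".join(
    struct.pack("<2f", lo, scale) + codes
    for lo, scale, codes in map(quantize, vectors)
)

f32_gzip = gzip.compress(f32_blob, compresslevel=9)

print("float32 raw :", len(f32_blob), "bytes")
print("float32 gzip:", len(f32_gzip), "bytes")
print("8-bit raw   :", len(q8_blob), "bytes")
```

On data like this, gzip barely shrinks the float32 payload because the mantissa bits look like noise, while the 8-bit payload is roughly a quarter of the size before any compression. The cost is the `dequantize` call (or an equivalent integer-domain distance) on every comparison, which is where the raw query time goes.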
So I assumed it would get crushed by OPQ (which requires training).