> I am confused what actually happens in the vectorized ADD and MULT instructions in the GPU with these quantized numbers.
I might be wrong, but I think LLMs are mostly about comparing distances between tokens. You can tell that -255 and +255 are very far apart, but you are also aware that -8 and +8 are very far apart relative to their range.
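
A quick way to see this (a toy example of my own, not from either paper): relative comparisons largely survive shrinking the range, because rescaling plus rounding roughly preserves the angles between vectors.

```python
# Toy example (mine): cosine similarity between two vectors before and
# after coarse quantization from a roughly +/-255 range to +/-8.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([200.0, -150.0, 50.0])
b = np.array([180.0, -100.0, 90.0])

# Rescale and round to the coarser +/-8 grid (4-bit-ish).
qa = np.round(a * 8 / 255)
qb = np.round(b * 8 / 255)

print(cosine(a, b))    # ~0.97 at full range
print(cosine(qa, qb))  # ~0.96, nearly unchanged after quantization
```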
Microsoft's BitNet and Google's TurboQuant show that, in the extreme, you can get away with just -1, 0, +1.
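
To the quoted ADD/MULT question: the appeal of ternary weights is that the multiply half essentially disappears. A minimal sketch (mine, not BitNet's actual GPU kernel):

```python
# Illustrative sketch: a dot product where the weights are restricted to
# {-1, 0, +1}. Each "multiply" degenerates into add / subtract / skip.

def ternary_dot(weights, activations):
    """Dot product with ternary weights: no real multiplications needed."""
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            acc += x      # +1 * x  ->  add
        elif w == -1:
            acc -= x      # -1 * x  ->  subtract
        # w == 0 contributes nothing -> skip
    return acc

print(ternary_dot([1, 0, -1, 1], [3.5, 2.0, 4.0, -1.5]))  # 3.5 - 4.0 - 1.5 = -2.0
```

Real kernels pack the ternary weights into a couple of bits each and run this as vectorized integer adds, but the arithmetic idea is the same.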