Yeah, "1.58 bit" is 1 trit with three states, since log2(3)≈1.58.
So it's not an inference framework for 1-bit models (two states per parameter) but for 1.58-bit models (three states per parameter). Annoying that they mix up the two.
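To make the arithmetic concrete, here is a rough Python sketch. The helper names `pack_trits` and `unpack_trits` are made up for illustration, and the base-3 packing is just one possible scheme, not necessarily how the framework actually stores weights. It shows that one trit carries log2(3) bits, and that five trits fit in a byte because 3^5 = 243 ≤ 256, which works out to 1.6 bits per weight in practice.

```python
import math

# Information content of one ternary digit (trit), in bits.
bits_per_trit = math.log2(3)

TRITS_PER_BYTE = 5  # 3**5 == 243, which fits in one byte (256 values)


def pack_trits(trits):
    """Pack up to 5 ternary weights from {-1, 0, 1} into one int in 0..242.

    Hypothetical base-3 encoding, chosen only for illustration.
    """
    if len(trits) > TRITS_PER_BYTE:
        raise ValueError("at most 5 trits fit in one byte")
    value = 0
    for t in reversed(trits):
        if t not in (-1, 0, 1):
            raise ValueError("each weight must be -1, 0, or 1")
        value = value * 3 + (t + 1)  # map -1, 0, 1 to digits 0, 1, 2
    return value


def unpack_trits(byte, n=TRITS_PER_BYTE):
    """Inverse of pack_trits: recover n ternary weights from one byte."""
    out = []
    for _ in range(n):
        out.append(byte % 3 - 1)  # map digits 0, 1, 2 back to -1, 0, 1
        byte //= 3
    return out


if __name__ == "__main__":
    weights = [-1, 0, 1, 1, -1]
    packed = pack_trits(weights)
    print(f"bits per trit: {bits_per_trit:.3f}")
    print(f"packed byte: {packed}, round trip: {unpack_trits(packed)}")
```

A true 1-bit model would fit eight weights per byte instead of five, which is why the distinction matters for memory footprint.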
I always hope for "just a bunch of if statements" ... this is not it.