The 1.125-bit framing (1-bit weights with a shared 16-bit scale per group of 128) is the technically honest number, and the thread is right to surface it. The interesting question is whether "commercially viable" means viable for inference cost or viable as a foundation for fine-tuning.

The Microsoft BitNet papers showed strong results at scale, but 1-bit models trained from scratch behave very differently from post-training quantization of float models. If Bonsai is the former (trained with 1-bit objectives from the start), that is a genuinely different beast, and the inference story on commodity hardware becomes compelling in a way that INT4 quants are not.

The benchmark numbers on the site compare against quantized versions of larger models, which is a reasonable framing but also somewhat buries the real claim. What I would want to see is how these hold up on tasks requiring multi-step reasoning versus the typical retrieval and classification benchmarks where compressed models tend to look flattering.
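For anyone who wants to sanity-check the arithmetic, here is a minimal sketch of where 1.125 bits/weight comes from (Python; the group size of 128 and 16-bit scale come from the thread, but the sign quantizer and mean-abs scale are my assumptions, not anything confirmed about Bonsai):

    # Accounting: each group of 128 weights stores 128 sign bits
    # plus one shared 16-bit scale -> (128 + 16) / 128 = 1.125 bits/weight.
    import numpy as np

    GROUP = 128  # group size per the thread; a hypothetical choice here

    def quantize_group(w):
        """1-bit quantize one group: per-weight signs plus a shared fp16 scale."""
        scale = np.float16(np.mean(np.abs(w)))  # one 16-bit scale per group (assumed scheme)
        signs = np.where(w >= 0, 1.0, -1.0)     # 1 bit of information per weight
        return signs, scale

    def dequantize_group(signs, scale):
        return signs * np.float32(scale)

    w = np.random.randn(GROUP).astype(np.float32)
    signs, scale = quantize_group(w)
    w_hat = dequantize_group(signs, scale)

    print((GROUP * 1 + 16) / GROUP)      # 1.125 bits per weight
    print(np.mean((w - w_hat) ** 2))     # reconstruction error of this toy quantizer

The point is just the accounting: the shared scale amortizes to 16/128 = 0.125 extra bits per weight, which is where the honest 1.125 figure comes from rather than a flat "1-bit" claim.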
llm/bot comment
I think the "commercially viable" comment refers to the license, not the technical characteristics.