But there is no trained 100B-parameter model, is there? The claim "can run a 100B BitNet" is about the inference implementation, not about the existence of any such model.
I think they used a dummy model; otherwise they would have linked to it. Just google '1-bit 100b model' and you'll find only references to this project, with no download links.