
dannyw today at 4:19 AM

It’s a 6bn model. Totally different class. I’m more excited about “frontier small language models” tbh.


Replies

andai today at 2:18 PM

It's a 119B model, 6B active.

That's still roughly 3-13x smaller in total parameters than the other models in that graph, though (400B, 1T, 1.5T).
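For anyone who wants to sanity-check those ratios, here is a minimal Python sketch. It assumes the parameter counts quoted in this thread (119B total, 6B active, and 400B / 1T / 1.5T for the other models) are accurate; the names `size_ratios`, `TOTAL_B`, `ACTIVE_B`, and `OTHERS_B` are made up for illustration.

```python
# Parameter counts in billions, as quoted in this thread (not independently verified).
TOTAL_B = 119                 # total parameters of the model under discussion
ACTIVE_B = 6                  # active parameters, per the reply above
OTHERS_B = [400, 1000, 1500]  # the other models in the graph


def size_ratios(baseline_b, others_b):
    """Return how many times larger each other model is than the baseline."""
    return [round(other / baseline_b, 1) for other in others_b]


# Ratio of each other model's total size to this model's total size.
total_ratios = size_ratios(TOTAL_B, OTHERS_B)

# Fraction of this model's parameters that are active.
active_fraction = round(ACTIVE_B / TOTAL_B, 3)

print(total_ratios)
print(active_fraction)
```

On total parameters this works out to roughly 3.4x, 8.4x, and 12.6x, and only about 5% of the model's parameters are active.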