Hacker News

gozucito yesterday at 11:28 AM

Can it scale to an 800-billion-parameter model? 8B-parameter models are too far behind the frontier to be useful to me for SWE work.

Or is that the catch? Either way, I am sure there will be some niche uses for it.


Replies

taneq yesterday at 11:29 AM

Spam. :P
