Hacker News

Glemllksdf · yesterday at 3:43 PM

If it's really more expensive per token, it might have more parameters and would then be able to hold more context/scope of code.

Rumors say it has 10 trillion parameters vs. 1 trillion.


Replies

rakejake · today at 1:19 AM

Yes, that tracks with my personal experience. More context, more params, and no quantization is probably it. But my hunch is that all the training data they've been collecting over the past year also plays a part here. More than any other lab, Anthropic's focus on coding right from the beginning gives them access to the best training data (several GitHubs' worth). Most of this code comes with human feedback, and Anthropic even has data on how much of it went to production, got reverted, etc. No need to pay for human labeling when your customers are doing it for you. This is their secret sauce.