Just look at large open weights models being served by inference providers. Kimi 2.6 is a 1 trilli...

mediaman • yesterday at 11:32 PM • 1 reply • view on HN

Just look at large open weights models being served by inference providers.

Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet. Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.

Replies

majormajor • today at 1:02 AM

I'm not sure just how good that looks for Anthropic/OpenAI.

4-7x isn't a tiny markup, but how does that compare to high-margin internet businesses like AdSense? Meta and Google do hundreds of billions in ad revenue a year, and after taking out the publisher's portion (60-80% per some searching), I wonder what the ratio of the remaining tens-of-billions is against the compute cost and headcount required to run it.

And how much room for maintaining or improving that margin do they have if the cheap competitors also continue getting better? Is there a "good enough" point where the easier inference tasks are all moving to vendors massively undercutting them, and then they don't have the volume necessary to justify spending on further cutting-edge development?

alt Hacker News

Replies