Hacker News

ACCount37 · yesterday at 1:33 PM

Check the token prices for open weight LLMs at various independent inference providers.

That gives you a very good estimate of how cheaply a model of size N can be served while still turning a profit.

Now, keep in mind: Kimi K2.5 is a 1T-parameter MoE. Today's frontier LLMs are also MoE, in the 1T to 5T range. Make an estimate, then compare it with the actual frontier lab prices.
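The estimate suggested above can be sketched as a crude linear scaling from open-weight prices. This is a minimal sketch under my own assumptions: the placeholder price is hypothetical (not a quote from any real provider), and serving cost is assumed to scale roughly with total parameter count, ignoring active-expert counts, batching, context length, and hardware differences:

```python
# Back-of-envelope version of the estimate the comment suggests.
# All numbers are hypothetical placeholders -- plug in current
# per-token prices from independent inference providers yourself.

def estimated_frontier_cost(open_weight_price_per_mtok: float,
                            open_weight_params_t: float,
                            frontier_params_t: float) -> float:
    """Naively scale serving cost linearly with total parameter count.

    A crude assumption: MoE serving cost actually depends on active
    parameters, batching efficiency, and hardware, not just total size.
    """
    return open_weight_price_per_mtok * (frontier_params_t / open_weight_params_t)

# Hypothetical: a 1T-param open-weight model served at $2 per million tokens.
base_price = 2.00
for size_t in (1.0, 2.0, 5.0):
    est = estimated_frontier_cost(base_price, 1.0, size_t)
    print(f"{size_t:.0f}T model: ~${est:.2f} / Mtok (break-even estimate)")
```

Whatever numbers you plug in, the gap between that break-even estimate and actual frontier lab list prices is the margin the comment is pointing at.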


Replies

lolc · yesterday at 3:03 PM

I don't think it's as easy as looking at open-weight API prices. We don't know whether the operators are making a profit on all the hardware they bought; maybe the prices we pay just cover electricity. It's not even certain that API prices cover running costs: the operators may be siphoning content and subsidizing the service by selling that.

In the current volatile environment, API prices are more of a baseline: we can assume it can't be much cheaper than that to operate these models.
