logoalt Hacker News

vineyardmikeyesterday at 9:55 PM1 replyview on HN

The costs of Netflix and Spotify are licensing. Offering the subscription at half price to additional users is non-cannibalizing and a way to get more revenue from the same content.

The cost of LLMs are the infrastructure. Unless someone can buy/power/run compute cheaper (Google w/ TPUs, locales with cheap electricity, etc), there won't be a meaningful difference in costs.


Replies

storystarlingtoday at 8:05 AM

That assumes inference efficiency is static, which isn't really the case. Between aggressive quantization, speculative decoding, and better batching strategies, the cost per token can vary wildly on the exact same hardware. I suspect the margins right now come from architecture choices as much as raw power costs.