
harrouet · yesterday at 11:29 AM

Running LLMs locally is one way to appreciate the scale of hardware and infrastructure that frontier AI companies operate. It makes me wonder about their future strategies.

As one commenter mentioned, 2x Mac Studio M3 Ultra with 512GB each can run frontier models, and it costs about $30k (with RDMA). Apply an efficiency ratio for datacenter-grade hardware, and you understand why OpenAI and the like spend north of $10k _per customer_ in CAPEX.
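A back-of-envelope version of that claim (every figure below is a hypothetical assumption for illustration, not a sourced number):

```python
# Back-of-envelope CAPEX per customer. All figures are assumptions
# chosen for illustration, not sourced numbers.
local_rig_cost = 30_000      # 2x Mac Studio, 512GB each, as above
datacenter_efficiency = 3.0  # assume a datacenter dollar goes ~3x further

# Even with that efficiency gain, hardware to cover one customer's
# peak demand still lands on the order of the figure quoted above.
capex_per_customer = local_rig_cost / datacenter_efficiency
print(f"~${capex_per_customer:,.0f} per customer")  # -> ~$10,000 per customer
```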

Add to that the electricity costs and you've got a very shaky business model. I, for one, would like to thank the VCs for subsidizing my tokens.

With that said, the VCs are not crazy; they probably factored in an annual decline in the cost of compute. But how do you make sure users won't run local LLMs once the hardware becomes affordable, if it ever does?
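For a sense of the timeline such a bet implies (the 30%/year decline rate here is purely an illustrative assumption):

```python
# Hypothetical: years until a $30k frontier-capable rig costs $3k,
# assuming a 30%/year hardware price decline (illustrative rate only).
cost, years = 30_000.0, 0
while cost > 3_000:
    cost *= 0.70  # 30% annual price decrease
    years += 1
print(years)  # -> 7 years at this assumed rate
```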

The answer has always been the same in our industry: vendor lock-in. They are acquiring users at a loss now, hoping for captive revenue later.

So be careful when maintaining your code requires the full context that produced it, and that context lives only inside [Claude Code|Codex|Cursor].