Now include the externalized cost in the U.S. of deploying ~100% of productive capital to build data centers instead of, for example, first-world transportation infrastructure, and tell me which one is cheaper
OpenRouter and other LLM platforms are being subsidized by VC investment to less than it costs them to run inference, the MacBook Pro is not
OpenRouter doesn't expose all the LLM sampling parameters/research that llamacpp, vllm, sglang, et al expose (so no high temperature/highly diverse outputs). Also OpenRouter doesn't let you use steering vectors or LoRA or other personalization techniques per-request. Also no true guarantees of ZDR/privacy/data sovereignty.
Oh, and the author didn't mention at all anything related to inference optimization, so no idea if they even know about or enabled things like speculative decoding, optimized attention backends, quantization, etc.
At least AI slop would have hit on far more of the things I listed above. This is worse-than-AI.
I think that the main flaw in the reasoning is assuming that cost of token will stay the same over the years.
Chances are that token prices will go down, but chances also are that the AI bubble pops and all of a sudden all these companies will either have to make a buck out of the inference or go bankrupt.
Getting your own hardware just grants you stable pricing.
[flagged]
The full-amortization framing is doing a lot of work here. I bought my laptop because I needed a laptop, not as an inference box, and running a model on it is incidental to that. Once the hardware is sunk for other reasons, the only cost left is electricity plus whatever depreciation you accelerate by hammering the SoC, which the post actually acknowledges in one parenthetical before allocating the full $4299 to tokens anyway.
Also nobody I know picks local over OpenRouter on price. They pick it for offline, for data not leaving the machine, for no rate limits, for not having a provider go down mid-task. If $/Mtok is the only axis, sure, cloud wins.
In practice the pattern I see is leaving a small model running on easy background tasks while using the laptop normally, not a dedicated inference box hammered flat out for 5 years.
[dead]
[dead]
[dead]
[dead]
... but comes with privacy guarantees *
* but Apple will collect all your keystrokes anyway