Unless I'm misunderstanding, this is counting the entire laptop in the cost of generating tokens. The calculation seems to omit that, in addition to receiving LLM output, you have also received a laptop in exchange for your money. If you intend to put this machine in a dark corner and run it solely as a token-munching server, a laptop would be an exceptionally poor choice of technology for this purpose. But if you intend to use the laptop as a laptop, having a laptop is a pretty big benefit over not having a laptop.
You also get the benefit of privacy, freedom from censorship, and control over the model used (i.e. it will not be rugpulled on you in three months after you've built a workflow around a specific model's idiosyncrasies).
> control over the model used
But you lose access to the most capable models; you can only run the small ones.
> in addition to receiving LLM output, you have also received a laptop in exchange for your money
And, since it's a Mac, whenever you're ready to upgrade it'll still have a fairly decent resale value.
OpenRouter can’t play Cyberpunk 2077 max setting 5K HDR!
OP is giving you the absolute best case compared to most of the people who've been overcome with psychosis hoarding Macs.
An unreasonable number of these people spent $10,000+ for Mac Studios that are still compute bottlenecked and don't have anything more efficient than Gemma 4 to run.
Yeah, a better metric might be the difference in cost between the laptop you need to run local models and the laptop you would have bought anyway.
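A rough sketch of that metric, to make the arithmetic concrete. Every number here is made up for illustration, the function and parameter names are hypothetical, and it ignores electricity and your time. `resale_delta` is an assumed extra resale value the pricier machine keeps over the baseline one.

```python
def incremental_cost_per_million_tokens(
    local_capable_price: float,
    baseline_price: float,
    lifetime_tokens: float,
    resale_delta: float = 0.0,
) -> float:
    """Hardware cost attributable to local inference, per million tokens.

    Only the premium over the laptop you would have bought anyway is
    charged to token generation. Illustrative assumption: power and
    maintenance costs are ignored.
    """
    if lifetime_tokens <= 0:
        raise ValueError("lifetime_tokens must be positive")
    premium = (local_capable_price - baseline_price) - resale_delta
    return premium / (lifetime_tokens / 1_000_000)


# Hypothetical figures: a $4000 machine vs. a $2000 one you'd buy anyway,
# generating 500M tokens over its life.
naive = incremental_cost_per_million_tokens(4000, 0, 500_000_000)
adjusted = incremental_cost_per_million_tokens(4000, 2000, 500_000_000)
print(naive)     # 8.0  (whole laptop charged to tokens)
print(adjusted)  # 4.0  (only the premium charged to tokens)
```

With these invented inputs, charging the whole laptop to tokens doubles the apparent per-token cost compared with charging only the premium. The real ratio depends entirely on what you would have bought otherwise.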