Currently it costs so much more to host an open model than it costs to subscribe to a much better hosted model. Which suggests it’s being massively subsidised still.
Efficiency goes way up with concurrent requests, so not necessarily subsidy, could just be economy of scale.
You can use open models through OpenRouter, but if you want good open models they’re actually pretty expensive fairly quickly as well.
If I drop $10k on a souped-up Mac Studio, can that run a competent open-source model for OpenClaw?
For a lot of tasks smaller models work fine, though. Nowadays the problem is less model quality/speed, but more that it's a bit annoying to mix it in one workflow, with easy switching.
I'm currently making an effort to switch to local for stuff that can be local - initially stand alone tasks, longer term a nice harness for mixing. One example would be OCR/image description - I have hooks from dired to throw an image to local translategemma 27b which extracts the text, translates it to english, as necessary, adds a picture description, and - if it feels like - extra context. Works perfectly fine on my macbook.
Another example would be generating documentation - local qwen3 coder with a 256k context window does a great job at going through a codebase to check what is and isn't documented, and prepare a draft. I still replace pretty much all of the text - but it's good at collecting the technical details.