I think this is a compelling argument, but I see two issues:
1. I remain unconvinced LocalAI can work well for the majority of businesses. It looks vaguely comparable on benchmarks, but it tends to be fragile and a lot of management overhead in reality.
2. Similarly, while Deepseek is comparable to Opus/Codex on benchmarks, for agentic work at scale I definitely notice the difference. That's not to say it's not economical, just that I definitely miss the big boys when I swap.
I kind of wish this was true, because the UK would be in a great place to compete with the US. But somehow people are happy to pay 3x the salary for an engineer in SF.
Fair points. I used to think that too until a few months ago, but the latest generation of OSS models is surprisingly good. Plus, maybe it is the way I work, but I find myself constantly overriding the decisions of frontier LLMs (because they start degenerating towards god objects and spaghettification), so most of the use I have gotten out of AI agents is really their ability to code quickly and syntactically correctly.
Also worth noting that it doesn't have to be a full either-or: there can be a two-tier enterprise deployment that routes between a locally hosted model and a frontier model, and over time more and more use cases could get routed to the local LLM.
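To make the two-tier idea concrete, here is a minimal sketch of what such a router could look like. Everything in it is an assumption for illustration: the `Request` type, the use-case names, and the `"local"`/`"frontier"` labels are made up, and a real deployment would more likely route on evals, cost, or data sensitivity than on a hard-coded allowlist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    use_case: str  # hypothetical label assigned by the calling application
    prompt: str


# Hypothetical allowlist: use cases already validated on the local model.
LOCAL_USE_CASES = frozenset({"summarize", "classify"})


def route(request: Request, local_use_cases: frozenset = LOCAL_USE_CASES) -> str:
    """Return which tier should serve the request: 'local' or 'frontier'."""
    if request.use_case in local_use_cases:
        return "local"
    # Anything not yet validated locally falls back to the frontier model.
    return "frontier"


if __name__ == "__main__":
    print(route(Request("summarize", "Summarize this ticket")))
    print(route(Request("agentic_coding", "Refactor this module")))
    # Migrating a use case over time is just growing the allowlist.
    migrated = LOCAL_USE_CASES | {"agentic_coding"}
    print(route(Request("agentic_coding", "Refactor this module"), migrated))
```

The point of the allowlist shape is that moving a use case to the local tier is a config change, not a rewrite, which is what makes the gradual migration plausible.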
I wish Deepseek could read images. I've been having good luck guiding it around on personal projects, but anything that needs to render to a screen really needs to be looked at to see bugs.
> It looks vaguely comparable on benchmarks, but it tends to be fragile and a lot of management overhead in reality.
I'm working on a self-hostable LLM (web) UI[0] that aims to provide a comparably good UX to e.g. ChatGPT, and you are right that there is a decent amount of fragility involved, and more management overhead than most people would expect.
However, we usually find that those issues arise a lot more in e.g. the harness (= our application), or in the prompt tuning that's required for each of the models, rather than in model quality itself. We have seen customers using self-hosted LLMs with similar user satisfaction across their organization to other customers that lean heavily on the latest GPT-5 models on Azure. Especially given that you have to do some level of tuning and setup anyway, you might as well invest it in "local"/self-hosted AI (if you can make the financials of the inference cost work out for you).
I think it should also be noted that the inference providers on hyperscalers also tend to be quite fragile, each in their own way (e.g. Google with a horrible rate limit system or Azure with almost weekly intermittent 500-error incidents).
[0]: https://github.com/EratoLab/erato
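For the rate-limit and intermittent 500-error fragility mentioned above, the usual harness-side mitigation is retrying with exponential backoff. A minimal sketch follows, not taken from any particular project: `TransientError` and `call_with_backoff` are hypothetical names, and it assumes the harness has already mapped HTTP 429/5xx responses to that exception.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error the harness raises for HTTP 429 or 5xx responses."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on TransientError, doubling the wait each time, with jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait base_delay * 2^attempt, scaled by a random factor in [1, 2).
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable only so the behaviour can be exercised without actually waiting; in production the default `time.sleep` would be used.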