It's good that there is a movement for open LLMs, but it's not where the battleground is right now. The battleground is local vs service LLMs, and we are losing that battle badly despite all the software being here now and viable, entirely because UX sucks.
How many normal people do you know who use "ChatGPT"? A lot, probably.
How many even know what "Gemma" is, let alone have downloaded llama.cpp, a GGUF file from Hugging Face, and run "llama-server" from a text console with all the correct command arguments? How many are thinking about this use case when speccing out their next computer? Where is the breathless marketing copy boasting x tok/s?
We are sleepwalking into slavery.
Normal people can go open an account at DeepSeek or Xiaomi and chat away for free. Or, for that matter, a couple of other models like z.ai's (GLM-5.2 isn't in the free tier, but neither is GPT-5.5-Pro), or Qwen, which does have 3.7-Max for free with no account on their chatbot interface.
Yes, I realise this isn't "running a local model", but it's using models that can be grabbed and run locally. For my pipelines, I feel far more confident when I use an open model (even one like GLM-5.2 that would be expensive for me to run) since I have a backup plan if the hosted/cloud option becomes unworkable for me. If that happens to me with Opus, I have zero options.
If our strategy to avoid "slavery" involves "normal people" taking the local-vs-managed choice seriously, we have already lost.
This choice is made for us. The deciding factors will be convenience and economics.
My sense is that, just like with Web 2.0 SaaS, we are destined for servitude.
A better strategy is to play an asymmetrical game IMO. Don't let your would-be master write the rules by which you play.
normal people don't really have the hardware to run local models
Google Edge Gallery is turnkey, and it runs on the device most people already use ChatGPT on. As with most Google stuff, though, “Edge Gallery” is maybe the worst name possible for “run AI on your phone”!
Why do you feel the important part _now_ is where the weights get run?
I can see this as a future battleground but access to frontier models (which you cannot run locally) seems a lot more relevant today.
You can’t run a closed LLM locally. Strange to frame the dichotomy as between local and open. One begets the other.
Better UX does not buy you a datacenter farm to train state-of-the-art models. Right now the only people who can do that are the technobility class.
> We are sleepwalking into slavery.
That’s a bit hyperbolic…
it's funny because i made this thing (called enough) that aims to make it easy for non-technical people to get up and running with local models quickly, but it is impossible to figure out how to break through the noise. every thread and comment like this breaks my heart a lil bit
Yea, anyone who understands what makes products actually usable is opting to get paid for said skill.
> we are losing that battle badly despite all the software being here now and viable, entirely because UX sucks.
Yep. I'm an old time Linux sysadmin, but I am COMPLETELY baffled as to what I can or cannot run on my 32GB R9700 with 128GB main CPU memory.
If I want something Claude- or Codex-like, what do I use that would be useful? If I want a chat system, what do I use? Images--apparently ComfyUI for setup, but after that what do I do?
I don't even mind spinning up something in the cloud for a bit, but I need to know how I'm going to get data up and down without racking up massive bandwidth charges.
I'd love to do some tinkering, but the field is moving so fast and so full of charlatans that cleaning the dross out is almost impossible.
LM Studio
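For the "what can I run on 32GB" part, a back-of-the-envelope check gets you most of the way. This is only a rough sketch: it counts the weights (parameters times bits per weight) and adds a flat allowance for KV cache and runtime buffers. The function names, the 20% headroom figure, and the ~4.5 bits/weight used for a 4-bit-style quant are all my own assumptions, not numbers from any tool; real usage depends on context length and the runtime.

```python
def estimate_weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(
    n_params_billion: float,
    bits_per_weight: float,
    vram_gib: float,
    headroom_fraction: float = 0.2,  # assumed allowance for KV cache and buffers
) -> bool:
    """Return True if weights plus the assumed headroom fit in the given VRAM."""
    needed_gib = estimate_weight_gib(n_params_billion, bits_per_weight)
    needed_gib *= 1 + headroom_fraction
    return needed_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical model sizes, checked against a 32 GiB card.
    for size_b in (8, 27, 70):
        gib = estimate_weight_gib(size_b, 4.5)
        verdict = "fits" if fits_in_vram(size_b, 4.5, 32) else "does not fit"
        print(f"{size_b}B at ~4.5 bits/weight: ~{gib:.1f} GiB of weights, {verdict}")
```

Anything that fails the check can still run with part of the model held in system RAM, just more slowly, which is where the 128GB of main memory comes in.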
"Normal people" have never bothered to host their own photos, music, videos, documents, communications, etc. To the point that for many their computer is essentially a thin client into someone else's server. Why would we think these same people would care about "personal" inference?