A wise man at Google said as much in an internal memo: "We have no moat, and neither does anyone else."
Deepseek v4 is good enough, really really good given the price it is offered at.
PS: Just to be clear: even the most expensive AI models are unreliable, make stupid mistakes, and their code output MUST be reviewed carefully, so Deepseek v4 is no different. It too is just a random token generator based on token frequency distributions with no real thought process, like all other models such as Claude Opus.
> just a random token generator based on token frequency distributions with no real thought process
I'm not smart enough to reduce LLMs and the entire AI effort to such simple terms, but I am smart enough to see the emergence of a new kind of intelligence, even when it threatens the very foundations of the industry I work in.
Deepseek v4, Qwen 3.6 Plus/Max, GLM 5+ are all pretty solid for most work.
I agree. Data and userbase are still the moats.
Once a new model or a technique is invented, it’s just a matter of time until it becomes a free importable library.
I tried to debug a script. I gave Deepseek v4 Pro and Claude the same prompt; they both made the exact same decisions, which led to the exact same issue, and to me telling them it still wasn't working, with context, over a dozen times.
Over a dozen times they both gave the same answer: not word for word, but the exact same reasoning.
The difference is that Deepseek did it at 1/40th of the price (API).
To be fair, Deepseek V4 Pro is 75% off currently, but even so we're speaking of something like $3 vs $20.
Fully agree. I only pay the minimum for frontier models, to get DeepSeek v4's output reviewed. I don't see this changing either, because we've reached "good enough" at this point.
> Deepseek v4 is good enough, really really good given the price it is offered at.
Do they have monthly subscriptions, or are they restricted to paying just per token? It seems to be the latter for now: https://api-docs.deepseek.com/quick_start/pricing/
Really good prices admittedly, but having predictable subscriptions is nice too!
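Per-token pricing is less predictable than a flat subscription, but it's easy to estimate from your own usage. A minimal sketch, with placeholder rates that are NOT DeepSeek's actual prices (check the pricing page linked above for real numbers):

```python
# Back-of-the-envelope cost estimate for pay-as-you-go token pricing.
# RATES BELOW ARE PLACEHOLDERS, not DeepSeek's actual prices; see the
# provider's pricing page for real numbers.

INPUT_RATE = 0.50   # hypothetical $ per million input tokens
OUTPUT_RATE = 1.50  # hypothetical $ per million output tokens

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate a month's API bill from token counts."""
    return (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE

# A heavy month: 50M input tokens, 10M output tokens.
print(f"${monthly_cost(50_000_000, 10_000_000):.2f}")  # $40.00 at these made-up rates
```

Run your last month's token counts through something like this and you get a number you can compare directly against a subscription price.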
Can Deepseek answer probing questions about Winnie the Pooh?
PS: Just to be clear: even the most expensive humans are unreliable, make stupid mistakes, and their output MUST be reviewed carefully, so you're not any different either. You're just a random next-thought generator based on neuron firing distributions with no real thought process, trained on a few billion years of evolution, like all other humans.
Don't they have the moat of being able to test their models on billions of people and gather feedback?
This is just starting to feel like desperation, this insistence that SOTA LLMs are random token generators with absolutely no possibility of anything beyond that. Keep shouting into the wind, though.
"Deepseek v4 is good enough, really really good given the price it is offered at."
Kimi, MiMo, and GLM 5.1 all score higher and are cheaper.
They all came out before DeepSeek v4. I think you're pattern-matching on last year's discourse.
(I haven't seen the other replies yet, but I assume they explain the PS, which amounts to "quality doesn't matter anyway". That still doesn't address the fact that it's more expensive and worse.)
We can't rule out a new innovation that makes frontier models more relevant than deepseek in 6 months. Things evolve so fast.
>[LLMs are just] random token generator based on token frequency distributions with no real thought
... and who knows whether we humans are not merely that.
What a crock of BS. A brain is "just" electrochemistry and a novel is "just" an arrangement of letters. The question isn't the substrate; it's what structure emerges on top of it. Anthropic's own interpretability work has surfaced internal features that look like learned concepts, planning, and something resembling goal-directed reasoning. Calling the outputs random is wrong in a specific way: the distribution is extraordinarily structured.
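To make the "structured, not random" point concrete: LLMs sample the next token from a learned distribution, which is nothing like uniform noise. A minimal sketch with a toy vocabulary and made-up logits (real models have vocabularies of ~100k tokens, but the mechanism is the same):

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw model scores into a probability distribution over tokens."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary with made-up logits for the context "The capital of France is".
vocab = ["Paris", "Lyon", "banana", "the"]
logits = [9.0, 4.0, 0.5, 1.0]

probs = softmax(logits)
# A literally random generator would give each token 0.25; here the learned
# scores concentrate almost all probability mass on "Paris".
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.4f}")
```

"Random" in the technical sense (sampling) is compatible with the distribution itself encoding a great deal of structure; that's where the whole argument lives.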
AI will never.... Until it does.
I don't think LLMs are that great at creating, however much they've improved; I need to stay in the driver's seat and really understand what's happening. There's not that much leverage in eliminating typing.
However, for reviewing, I want the most intelligent model I can get. I want it to really think the shit out of my changes.
I've just spent two weeks debugging what turned out to be a bad SQLite query plan (with no reliable repro). Not one of the many agents, nor GPT-Pro, thought to check this. I guess SQL query planner issues are a hole in their reviewing training data. Maybe Mythos will check such things.
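For anyone hitting the same class of bug: SQLite exposes the planner's decision via `EXPLAIN QUERY PLAN`, which is the check the reviewers apparently never made. A minimal sketch with a hypothetical table (the original schema and query are unknown):

```python
import sqlite3

# Hypothetical schema, just to illustrate the check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, ts TEXT)")

query = "SELECT * FROM events WHERE user_id = ?"

# Without an index on user_id, the planner falls back to a full table scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()
print(plan_before)  # detail column reads like 'SCAN events' (wording varies by version)

conn.execute("CREATE INDEX idx_events_user ON events(user_id)")

# With the index in place, the planner switches to an index search.
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()
print(plan_after)  # detail now mentions 'USING INDEX idx_events_user'
```

Diffing the plan output before and after a schema or query change is cheap enough to script into a test, which also gives you the reliable repro that was missing here.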