Until a year ago I believed as the author did. Then LLMs got to the point where they sit in meetings like I do, make notes like I do, have a memory like I do, and their context window is expanding.
Only issue I saw after a month of building something complex from scratch with Opus 4.6 is poor adherence to high-level design principles and consistency. This can be solved with expert guardrails, I believe.
It won’t be long before AI employees are going to join daily standup and deliver work alongside the team with other users in the org not even realizing or caring that it’s an AI “staff member”.
It won’t be much longer after that when they will start to tech lead those same teams.
After 2 years of using all of these tools (Claude C, Gemini CLI, opencode with all models available) I can tell you it is a huge enabler, but you have to provide these "expert guardrails" by monitoring every single deliverable.
For someone who is able to design an end-to-end system by themselves these tools offer big time savings, but they come with dangers too.
Yesterday a mid-level dev on my team proudly presented a web tool he "wrote" in Python (to be run on localhost) that runs kubectl in the background and shows things like the versions of images running in various namespaces. It looked very slick; I can already imagine the product managers asking for it to be put on the network.
So what's the problem? For one, there is no threading whatsoever: every query runs in a single thread. There is no auth either, and the list goes on. A maintenance nightmare waiting to happen. That is the risk when someone who knows something, but not enough, builds tools by themselves.
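For what it's worth, the two gaps named above (single thread, no auth) are cheap to close even in a stdlib-only tool. A rough sketch, not the dev's actual code: the names `kubectl_images`, `make_handler` and `serve`, the bearer-token scheme, and the URL layout (`/<namespace>`) are all made up for illustration, and it assumes `kubectl` is on the PATH with a working kubeconfig.

```python
import hmac
import json
import re
import subprocess
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Kubernetes namespace names are DNS labels: lowercase alphanumerics and '-'.
NAMESPACE_RE = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def kubectl_images(namespace):
    """List distinct container images in a namespace (assumes kubectl on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace,
         "-o", "jsonpath={.items[*].spec.containers[*].image}"],
        capture_output=True, text=True, timeout=10, check=True,
    )
    return sorted(set(result.stdout.split()))


def make_handler(token, query=kubectl_images):
    """Build a handler class; `query` is injectable so it can be faked in tests."""
    expected = f"Bearer {token}".encode()

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            supplied = self.headers.get("Authorization", "").encode()
            # Constant-time comparison of the shared token.
            if not hmac.compare_digest(supplied, expected):
                return self._send(401, {"error": "unauthorized"})
            namespace = self.path.strip("/") or "default"
            # Reject anything that is not a plausible namespace name.
            if not NAMESPACE_RE.fullmatch(namespace):
                return self._send(400, {"error": "bad namespace"})
            try:
                images = query(namespace)
            except (subprocess.SubprocessError, OSError) as exc:
                return self._send(502, {"error": str(exc)})
            self._send(200, {"namespace": namespace, "images": images})

        def _send(self, status, payload):
            body = json.dumps(payload).encode()
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet; real code would log requests

    return Handler


def serve(token, port=8080, query=kubectl_images):
    """Start a server that handles each request in its own thread."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(token, query))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With `ThreadingHTTPServer` a slow kubectl call no longer blocks every other request, and the token check at least stops anyone on the box from hitting it blind. It still binds to 127.0.0.1 on purpose; a shared static token over plain HTTP is nowhere near enough once the thing goes "on the network", which is exactly the kind of judgment call the tool won't make for you.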
I can take a verbal description from a meeting with five to ten people and put together something they can interact with in two weeks. That is a lot slower than Claude Code! Yet everywhere I’ve worked, this is more than fast enough.
Over two more weeks I can work with those same five to ten people (who often disagree or have different goals) and get a first draft of a feature or small, targeted product together. In those latter two weeks, writing code isn’t what takes time; working through what people think they mean versus what they are actually saying, and mediating between one group and another when they disagree (or mostly agree), is the work. And then, after that, we introduce a customer. Along the way I learn to become something of an expert in whatever the thing is and continue to grow the product, handing chunks of responsibility to other developers, at which point it turns into a real thing.
I work with AI tooling and leverage AI as part of products, where it makes sense. There are parts of this cycle where it is helpful and time saving, but it certainly can’t replace me. It can speed up coding in the first version but, today, I end up going back and rewriting chunks and, so far, that eats up the wins. The middle bit it clearly can’t do, and even at the end when changes are more directed it tends toward weirdly complicated solutions that aren’t really practical.
> poor adherence to high-level design principles and consistency. This can be solved with expert guardrails, I believe.
That’s a bit… handwavy…!
I've been hearing this for several years. How much longer is "it won't be long"?
The closer you get to releasing software, the less useful LLMs become. They tend to go into loops of 'Fixed it!' without having fixed anything.
In my opinion, attempting to hold the hand of the LLM via prompts in English for the 'last mile' to production-ready code runs into the fundamental problem of ambiguity in natural language.
From my experience, developers who believe LLMs are good enough for production are either building systems that are not critical (e.g. 80% is correct enough), or they do not have the experience to detect how LLM-generated code would fail in production beyond the 'happy path'.