Another way I'm "going slower" is to have the AI implement individual sub-steps of the current task, and review each one. It's slower than having it yolo out the whole thing, but the increments are much smaller to review, so my brain doesn't glaze over in a huge review the way it did when I had it do the whole task.
I'm following an Ideas -> PRD -> Issues -> Tasks methodology, where each task has a bunch of sub-tasks. I have it do just one sub-task at a time, or a few: I'm having it do Red/Green/Refactor as separate sub-steps, so I review the Red case, and once that's good, it does the Green and Refactor steps and I review those.
Instead of using a skill and having the agent own the flow for this, I've been building an external orchestrator that handles the process.
By default it uses pi agent core + pi ai (from the excellent pi coding agent) as a multi-model runtime, but it also supports a Claude Agent SDK runtime.
I can have an implementation and review process of an OpenSpec change run anywhere from 2 hours to 24+ hours going through review/fix/verification rounds automatically until the implementation matches the spec and any additional reviewers are done finding issues after the fix rounds.
It's going to be fully open sourced in the next two weeks and fully free to use.
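For readers wondering what such a review/fix/verification loop might look like in its simplest form, here is a rough sketch. It is not the orchestrator described above: `run_review_loop`, `RoundResult`, and the `review`, `fix`, and `verify` callables are all hypothetical names, standing in for whatever agent runtime and spec checks are actually used.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RoundResult:
    """Issues found in one review/verify round (hypothetical record type)."""
    round_number: int
    issues: List[str]


def run_review_loop(
    review: Callable[[], List[str]],      # assumed: returns reviewer findings
    fix: Callable[[List[str]], None],     # assumed: asks the agent to fix them
    verify: Callable[[], List[str]],      # assumed: returns spec mismatches
    max_rounds: int = 10,
) -> Tuple[bool, List[RoundResult]]:
    """Repeat review -> fix until a round reports no issues or rounds run out."""
    history: List[RoundResult] = []
    for round_number in range(1, max_rounds + 1):
        # Collect reviewer findings and spec mismatches for this round.
        issues = review() + verify()
        history.append(RoundResult(round_number, issues))
        if not issues:
            return True, history   # clean round: nothing left to fix
        fix(issues)
    return False, history          # gave up after max_rounds
```

The `max_rounds` cap is a design choice in this sketch, so a reviewer that never stops finding issues cannot keep the loop running forever.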
Very much agreed. Something specific that has helped me a lot (beyond just automatic formatting, linting and testing) was putting a hard fail on any file with more than 1500 lines or so, with an allowlist for specific files with specific reasons for their length. I realized the agents were squirreling away code without wanting to do any sort of refactor. Every time one of these rat's nests has turned up, the codebase has been much improved with a small refactor, to the point it doesn't feel like such a pile of slop anymore.
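A minimal sketch of that kind of line-count gate, in case it helps anyone. The 1500 threshold comes from the comment above; the allowlist entry, the `.py` suffix filter, and the function names are made up for illustration, not taken from anyone's actual setup.

```python
import sys
from pathlib import Path

MAX_LINES = 1500  # threshold mentioned in the comment above

# Hypothetical allowlist: relative path -> reason the file may be long.
ALLOWLIST = {
    "src/generated/schema.py": "generated code, not edited by hand",
}


def count_lines(path: Path) -> int:
    """Count lines without loading the whole file into memory."""
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


def find_oversized(root, max_lines=MAX_LINES, allowlist=ALLOWLIST, suffixes=(".py",)):
    """Return (relative_path, line_count) for files over the limit."""
    root = Path(root)
    offenders = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root).as_posix()
        if rel in allowlist:
            continue  # explicitly allowed, with a recorded reason
        n = count_lines(path)
        if n > max_lines:
            offenders.append((rel, n))
    return offenders


def main(argv):
    """Print offenders to stderr; return 1 if any exist, else 0."""
    root = Path(argv[1]) if len(argv) > 1 else Path(".")
    offenders = find_oversized(root)
    for rel, n in offenders:
        print(f"{rel}: {n} lines (limit {MAX_LINES})", file=sys.stderr)
    return 1 if offenders else 0
```

Wiring it into CI or a pre-commit hook would just mean ending the script with `sys.exit(main(sys.argv))`, so a non-zero exit code fails the build. Keeping a reason next to each allowlist entry makes every exception visible in review.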
Thank you. It's really important to remind people of this, especially those in upper management.
Stop being reasonable! This is a hypecycle!
This is the approach I take, with many guardrails and nested CLAUDE.md's to keep things sane.
How profound! Talking points are changing from "vibe coding delivers bug free software" to "slow down and enjoy the AI".
Great how the promoters are mirroring the current anti-AI sentiment. The next step is canceling all subscriptions and not using AI at all. Maybe your mind will work again.
The bug-finding use case alone makes this worth it.
AI makes senior engineers slower in the same way code reviews make teams slower: locally inefficient, globally beneficial.
This. Go slow. Use principles. Argue. Refactor. Improve before you commit. It is the way.
> This is the opposite of the “10x productivity” slop-cannon style of development that most people imagine when they think of vibe coding, but I find it very satisfying.
I can relate to this. When I spend time writing a unit test, even one that adds only 1% of code coverage, it's honestly a wholesome moment for me to ship it confidently.
Are we overcomplicating AI by approaching it top-down, so that naturally there are trillions of variations and too many ways to fail? Supervising a component-level scope, with emphasis on quality control (regression, perf testing, benchmarking, etc.), seems to produce great work.
I use cheaper models (DeepSeek is king, but GLM and Kimi as well) and do the planning myself. I often start a task myself, write some code to get the LLM on the right track, and then have it complete parts of the implementation that are kind of boring or repetitive. LLMs are just next-token predictors. I don't mean that in a demeaning way, but I've found that if I can get the LLM started on the right track with my own code, it completes what I want. Having the LLM write code just from a spec ends up with poor-quality slop in my experience.
I'm not 100x'ing my output like some people claim, but using it as an augmentation rather than delegating my work to it results in better code, and I don't lose context / control over my codebases. I really have read 100% of the code, because the LLM is generating smaller pieces around and inside my own written code. Works well enough for me, and open models are already both cheap enough and good enough for this workflow. This is why the big companies are so desperate to push full-on agentic hands-off workflows and developer replacement - that's the only way they won't go bankrupt.
Great article and right on point.
Another day, another AI-related thinkpiece
(that people upvote to post their own thinkpieces in the comments)
learn by considering critique
hmmm
> But the thing is, LLMs are very flexible. And you can use them just as effectively to write high-quality code more slowly.
There is a reason it is called slop. At first sight it is often not noticeable, but when you dig deeper, you realise that it is often spam-slop. Of course this can be improved upon, but often there is no real improvement and you waste your own time in the hope that things get better. Which high-quality projects exist that were generated as AI slop? Can people name something that is used by many people? The Linux kernel? Something in that range? Including documentation? To me it seems people are chasing a dream here: Skynet should write the code and they can sit on the beach, enjoying sunshine and fruits.
To me the blocker with using coding agents is having to rely on a paid external service. Are there any local models that are good enough to be used for coding?