Author here. If you don't want to read all that, here's one excerpt that I think sums it up nicely:
> My point is, the spec must live somewhere, even if you don’t write it down. The spec is what you want the software to be. It often exists only in your head or in conversations. You and your team and your business will always care what the spec says, and that’s never going to change. So you’re better off writing it down now! And I think that a plain old list of acceptance criteria is a good place to start. (That’s really all that `feature.yaml` is.)
I independently converged on something similar. I use two to three specification docs for my C++ work: a firmware manual (describes features and interfaces), an implementation plan (order of implementation, mechanisms where specified; new features go in here), and a product manual (user story, external effects). I start with a user story, build an implementation plan, write the code, write the firmware manual, then check the three documents plus the code for consistency and coherence. I change either the code or the documentation until they reflect a coherent, unified truth. (The implementation plan gradually becomes as-built.) I also have the code comprehensively commented so that it is difficult to misinterpret. “Correct, coherent, consistent, commented.”
We iterate feature by feature through this process, and occasionally circle back on the original product manual to identify drift.
After the original documentation is drafted, I have the agent write up placeholder files and define all of the interfaces we expect to need (we will end up adding a lot later, but that’s OK). Every file should reflect a clear separation of concerns and can only be reached into through its defined interface; all else is private. I end up with more individual files than I would by hand, but by constraining scope at file granularity and defining an inviolate interface per file, I avoid the LLM tendency to take shortcuts that create unmaintainable code.
I also open each new context with an onboarding process that briefly describes the logos and the ethos of the project, why the agent should be deeply invested in the success of the project, as well as `learnings.md`, which the agent writes as it comes across notable gotchas or strong preferences of mine.
Needless to say, I use a one-million-token context, and it’s a token fire… but the results are solid and my productivity is 5-10x.
I wrote something similar recently about how agent-generated code lacks the institutional memory that human-written code has. There's nobody to ask why a decision was made (1).
“Specsmaxxing” is basically the right response to this. When you can't rely on authorial memory, you have to put the intent somewhere durable. Specs become the source of truth by default if we continue down the road of AI-generated code.
1: https://ossature.dev/blog/ai-generated-code-has-no-author/
This ultimately converges on what source code is though.
The most common form of what you'd call a "spec" is the acceptance criteria on a work ticket, which is an accretive spec, i.e. a description of desired change: "given what already exists, change it as follows". In other words, if you somehow layered and summarized and condensed all the tickets that have been made since the product started, you'd have your "spec".
But it's the devs who were doing that condensing via understanding each desired spec addition vs reality of existing codebase.
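To make the "layer and condense all tickets" idea concrete, here is a rough sketch that replays tickets in order to get the current spec. The `Ticket` shape and the ticket IDs are hypothetical; real tickets are free text, which is exactly why the condensing has been done by people.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    """One work ticket: a change to apply on top of the existing spec."""
    ticket_id: str
    criterion_key: str
    new_text: Optional[str]  # None means "remove this criterion"


def fold_tickets(tickets: list[Ticket]) -> dict[str, str]:
    """Replay tickets in order; later tickets override or remove earlier ones."""
    spec: dict[str, str] = {}
    for ticket in tickets:
        if ticket.new_text is None:
            spec.pop(ticket.criterion_key, None)
        else:
            spec[ticket.criterion_key] = ticket.new_text
    return spec


history = [
    Ticket("T-1", "login", "Users log in with email and password."),
    Ticket("T-2", "export", "Users can export reports as CSV."),
    Ticket("T-3", "login", "Users log in with email and a one-time code."),
    Ticket("T-4", "export", None),
]
current_spec = fold_tickets(history)
print(current_spec)
```

Only the "login" criterion survives, in its T-3 wording: the spec is whatever is left after every change has been applied in order.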
So the gap between what people are currently calling "specs" and what the code was already doing is not big and will not stay big, but for the fact that you're effectively adding another (quasi) compile step underneath, and in this case it's a non-deterministic one.
> will always care what the spec says, and that’s never going to change
Did I miss something, or is everyone back in the 1970s, working in waterfall processes now?
What's the difference between this and Jira? Your specs already live somewhere: where you defined them. That's why it's nice to put the Jira ticket number in your code or commit, so you can refer back to the spec when something breaks.
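A small sketch of the ticket-number-in-commit habit: pulling ticket keys back out of a commit message so tooling can link to the spec. The regex assumes the usual Jira key shape (uppercase project key, hyphen, number); the example message and project keys are invented.

```python
import re

# Assumption: keys look like "PROJ-123" (uppercase project key, hyphen, number).
TICKET_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_keys(commit_message: str) -> list[str]:
    """Return every ticket key mentioned in a commit message, in order."""
    return TICKET_KEY.findall(commit_message)


print(ticket_keys("SPEC-42: tighten login validation (see also OPS-7)"))
```

A pattern this loose will also match things like "UTF-8", so a real hook would probably check matches against the known project keys.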
Nice! Your spec-maxxing is very resonant. I've been working with explicit requirements: eliciting them from conversation with me or by introspecting another piece of software, one-shotting from them, and keeping them up to date as I do the "old man shouts at Claude" iterations after whatever the one-shotting came up with.
Unlike you, I wish for the LLM to do as much of the work as possible -- but "as possible" is doing a lot of work in that sentence. I'm still trying to get clear on exactly where I am needed and where Opus and iterations will get there eventually.
It has really challenged me to get clearer on what a requirement is vs a constraint (e.g., "you don't get to reinvent the database schema, we're building part of a larger system"). And I still battle with when and how to specify UI behaviours: so much UI is implicit, and it seems quite daunting to have to specify so much to get it working. I have new respect for whoever wrote the undoubtedly bajillion tests for Flutter and other UI toolkits.
So what I'm building is a GitHub clone with epics/issues/kanban + specs/requirements/standards + CI/testing/coverage, with the idea that all of those things connect, so issues + requirements + testing all interact via code + web UI + CLI. The point is that we can specify how a product is to function and the steps to get there, and it's less a matter of telling a person or an LLM to read and implement the spec and more a matter of software actually keeping track at all times.
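One way to picture "software actually keeping track": each requirement carries links to its issues and tests, and a report lists requirements that no passing test backs up. This is only a sketch under assumed names (`Requirement`, `unverified`, the IDs); it is not how the commenter's tool works.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """A requirement linked to the issues and tests that relate to it."""
    req_id: str
    text: str
    issue_ids: list[str] = field(default_factory=list)
    test_names: list[str] = field(default_factory=list)


def unverified(requirements: list[Requirement],
               test_results: dict[str, bool]) -> list[str]:
    """IDs of requirements with no linked test that passed."""
    return [
        req.req_id
        for req in requirements
        if not any(test_results.get(name, False) for name in req.test_names)
    ]


requirements = [
    Requirement("REQ-1", "Users can log in.", ["ISSUE-3"], ["test_login"]),
    Requirement("REQ-2", "Users can export CSV.", ["ISSUE-4"], ["test_export"]),
    Requirement("REQ-3", "Admins can audit logins."),  # nothing linked yet
]
results = {"test_login": True, "test_export": False}
print(unverified(requirements, results))
```

Here REQ-2 (failing test) and REQ-3 (no test at all) get flagged, which is the kind of drift that otherwise depends on someone rereading the spec.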
I actually read it all since it did not contain any hints of being AI generated (although I wouldn't be surprised to learn you did use AI to write it), so thank you for that. It's kind of crazy how I now have the default expectation that posts posted here are AI slop with little thought or care put in.
I am also stealing the idea of talking to LLMs as if it's an email. So funny, we need to be joymaxxing a bit more I think :)
Great idea -- just one suggestion if you want it to catch on: perform some IncelCultureMinning on the nomenclature.
You probably don't want people associating your work with abusing crystal meth and hitting yourself in the face with a hammer.
For anyone missing the reference, SNL has a pretty good explainer:
The traditional name for this spec is ‘source code’ — a canonical source of truth for the behaviour of a system that is as human-readable as we know how to make it, that will be processed by automated tools into a less-readable derived artefact for a computer to execute.
Checking the compiled artefact into the codebase without checking in its source code has always been a risky move!