Hacker News

jedberg · yesterday at 6:44 AM · 21 replies

The way I write code with AI is that I start with a project.md file, where I describe what I want done. I then ask it to make a plan.md file from that project.md, describing the changes it will make (or what it will create, if the project is greenfield).

I then iterate on that plan.md with the AI until it's what I want. I then ask it to make a detailed todo list from the plan.md and attach it to the end of plan.md.

Once I'm fully satisfied, I tell it to execute the todo list at the end of the plan.md, and don't do anything else, don't ask me any questions, and work until it's complete.

I then commit the project.md and plan.md along with the code.

So my back and forth on getting the plan.md correct isn't in the logs, but that is much like intermediate commits before a merge/squash. The plan.md is basically the artifact an AI or another engineer can use to figure out what happened and repeat the process.

The main reason I do this is so that when the models get a lot better in a year, I can go back and ask them to modify plan.md based on project.md and the existing code, on the assumption it might find its own mistakes.
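The artifact trail this produces can be sketched as follows (a hypothetical example; file contents and the commit message are made up, and the interactive iteration with the AI is elided):

```shell
# Sketch of the project.md -> plan.md -> commit flow described above.
# File contents are illustrative placeholders.
cat > project.md <<'EOF'
# Project: add rate limiting to the API
What I want done, described in plain prose.
EOF

# ...iterate with the AI on plan.md until it is right, then have it
# append a detailed todo list at the end...
cat > plan.md <<'EOF'
# Plan
Changes the AI will make, derived from project.md.

## Todo
- [ ] add a token bucket module
- [ ] wire it into the request handler
EOF

# After the todo list is executed, both files are committed with the code:
#   git add project.md plan.md && git commit -m "Rate limiting (see plan.md)"
ls project.md plan.md
```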


Replies

jumploops · yesterday at 8:03 AM

I do something similar, but across three doc types: design, plan, and debug.

Design works similarly to your project.md file, but per feature request. I also explicitly ask it to outline open questions/unknowns.

Once the design doc (i.e. design/[feature].md) has been sufficiently iterated on, we move to the plan doc(s).

The plan docs are structured like `plan/[feature]/phase-N-[description].md`

From here, the agent iterates until the plan is "done", only stopping if it encounters some build/install/run limitation.

At this point, I either jump back to new design/plan files, or dive into the debug flow. Similar to the plan prompting, debug is instructed to review the current implementation, and outline N-M hypotheses for what could be wrong.

We review these hypotheses, sometimes iterate, and then tackle them one by one.

An important note for debug flows: as with manual debugging, it's often better to have the agent instrument logging/traces/etc. to confirm a hypothesis before moving directly to a fix.

Using this method has led to a 100% vibe-coded success rate on both greenfield and legacy projects.

Note: my main complaint is the sheer number of markdown files that accumulate over time, but I haven't gotten around to automating cleanup (or needed to), as sometimes these historic planning/debug files are useful for future changes.
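Under this naming scheme, the docs tree for a single feature might look like the following (a sketch; the feature name, phase names, and debug file are invented):

```shell
# Hypothetical layout for a "search" feature across the three doc types.
mkdir -p design plan/search debug
touch design/search.md                  # design doc, including open questions/unknowns
touch plan/search/phase-1-schema.md     # plan docs: plan/[feature]/phase-N-[description].md
touch plan/search/phase-2-indexing.md
touch debug/search-slow-queries.md      # hypotheses and instrumentation notes
find design plan debug -type f | sort   # list the accumulated markdown files
```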

frank00001 · yesterday at 7:24 AM

Sounds like the spec-driven approach. You should take a look at this: https://github.com/github/spec-kit

dmd · yesterday at 12:36 PM

https://github.com/obra/superpowers "brainstorming" is pretty much exactly this workflow, and it's great.

giancarlostoro · yesterday at 7:44 PM

This is how I used to use Beads before I made GuardRails[0]. I basically iterate with the model, ask it to do market research, review everything it suggests, and wind up with a "prompt" that tells it what to do and how to work, designed by the model using its own known verbiage. Having learned how XML can be used to influence Claude, I'm rethinking my flow and how GuardRails behaves.

[0]: https://giancarlostoro.com/introducing-guardrails-a-new-codi...

RHSeeger · yesterday at 3:20 PM

I do something similar:

- A full work description in markdown (including pointers to tickets, etc.), but not in a file
- A "context" markdown file that I have it create once the plan is complete, containing "everything important that it would need to regenerate the plan"
- A "plan" markdown file that I have it create once the plan is complete

The "context" file is for when, sometimes, it turns out the plan was totally wrong and I want to purge the changes locally and start over; discussing what was done wrong with it gives a good starting point. That being said, since I came up with the idea (from an experience where it would have been useful and I didn't have it), I haven't had an experience where I needed it. So I don't know how useful it really is.

None of that ^ goes into the repo though; mostly because I don't have a good place to put it. I like the idea though, so I may discuss it with my team. I don't like the idea of hundreds of such files winding up in the main branch, so I'm not sure what the right approach is. Thank you for the idea to look into it, though.

Edit: If you don't mind going into it, where do you put the task-specific md files in your repo, presumably in a way that doesn't stack up over time and cause noise?

nesarkvechnep · yesterday at 5:20 PM

By that time you would’ve written the code yourself, only better.

anbende · yesterday at 2:20 PM

Here’s how I do the same thing, just with a slightly different wrapper: I’m running my own stepwise runtime where agents are plugged into defined slots.

I’ll usually work out the big decisions in a chat pane (sometimes a couple panes) until I’ve got a solid foundation: general guidelines, contracts, schemas, and a deterministic spec that’s clear enough to execute without interpretation.

From there, the runtime runs a job. My current code-gen flow looks like this:

1. Sync the current build map + policies into CLAUDE|COPILOT.md
2. Create a fresh feature branch
3. Run an agent in "dangerous mode," but restricted to that branch (and explicitly no git commands)
4. Run the same agent again, or a different one, another 1-2 times to catch drift, mistakes, or missed edge cases
5. Finish with a run report (a simple model pass over the spec + the patch) and keep all intermediate outputs inspectable
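Steps 2 and 3 above (a fresh branch as the blast radius for a "dangerous mode" agent) can be sketched like this; the branch and directory names are invented, and the agent launch itself is elided since the runtime is custom:

```shell
# Set up an isolated feature branch for an agent run. The agent is denied
# git commands, so anything it does stays on this branch until reviewed.
set -e
mkdir -p run-demo && cd run-demo
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "baseline"
git checkout -q -b agent/feature-x   # step 2: fresh feature branch
# step 3: the runtime would launch the agent here, restricted to this branch
git branch --show-current            # prints the active branch name
```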

And at the end, I include a final step that says: “Inspect the whole run and suggest improvements to COPILOT.md or the spec runner package.” That recommendation shows up in the report, so the system gets a little better each iteration instead of just producing code.

I keep tweaking the spec format, agent.md instructions and job steps so my velocity improves over time.

To answer the original article's question: I keep all the run records, including the LLM reasoning and output, in a separate store, but it could be in the repo too. I just have too many repos and want it all in one place.

8note · yesterday at 7:42 PM

The real question is when peer feedback and review happen.

Is the project file collaborative between multiple engineers? The plan file?

I've tried some variants of sharing different parts, but it feels like it's almost wasted effort if the LLM still goes through multiple iterations to get things right; the original plan and project get lost a bit against the details of what happened in the resulting chat.

shinycode · yesterday at 7:58 AM

I also do that, and it works quite well to iterate on spec md files first. When every step is detailed and clear, and all md files are linked to a master plan that Claude Code reads and updates at every step, it helps a lot to keep it on guard rails. Claude Code only works well on small increments, because context switching makes it mix up and invent stuff. Working in increments makes it really easy to commit a clean session, and I ask it to give me the next prompt from the specs before I clear context. It always goes sideways at some point, but having a nice structure helps even myself to do clean reviews and avoid 2h sessions that I have to throw away. It's much easier to adjust only what's wrong at each step. It works surprisingly well.

ryanmcl · yesterday at 2:53 PM

This is fascinating and I wish I'd started with something like this from day one.

I'm 8 months into my first app as a self-taught developer and my biggest regret is having no artifact trail. I can describe what every piece of my app does, but if you asked me WHY I made specific architectural decisions, I'd struggle. Those conversations happened in Claude chat windows that are long gone.

The plan.md approach solves something I didn't realize was a problem until it was: your future self (or your future model) needs to understand not just what was built but what was considered and rejected. I've lost count of the times I've asked Claude "why does this work this way?" about my own code and neither of us could remember.

Starting a project.md for every feature going forward. Better late than never.

winwang · yesterday at 4:26 PM

Interesting! I actually split up larger goals into two plan files: one detailed plan for design, and one "exec plan" which is effectively a build graph but the nodes are individual agents and what they should do. I throw the two-plan-file thing into a protocol md file along with a code/review loop.

adam_patarino · yesterday at 1:44 PM

You check the plan files into git? Don’t you end up with dozens of md files?

I've been copying and pasting the plan into the Linear issue or PR to save it, but keep my codebase clean.

odiroot · yesterday at 4:59 PM

How do you use your agent effectively for executing such projects in bigger brownfield codebases? It's always a balance between the agent going way too far into NIH vs burning loads and loads of tokens for the initial introspection.

the-grump · yesterday at 7:19 AM

Stealing this brilliant idea. Thank you for sharing!

tlb · yesterday at 12:24 PM

Do you clear the file and use the same name for the next commit? Or create a new directory with a plan.md for each set of changes?

vorticalbox · yesterday at 11:46 AM

you may like openspec[0]

[0] https://openspec.dev/

fhub · yesterday at 8:29 AM

I do something similar, but I get Claude to review Codex every step of the way and feed it back (or vice versa, depending on the day).

iainmck29 · yesterday at 12:12 PM

Is this not what entire.io is doing? It was founded by the former GitHub CEO Thomas Dohmke.

Bombthecat · yesterday at 12:51 PM

Then you might like to look into automaker.

matkoniecz · yesterday at 12:08 PM

I do the same, but put it as a comment at the top of the generated file.

(So far I have not used LLMs to generate code larger than fits in one file.)

The overall idea is that I modify and tweak the prompt, and keep starting new LLM sessions and disposing of old ones.
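As a hypothetical example of this pattern (the file name and prompt are made up), the generating prompt lives as a comment header on the generated file itself, so regenerating means re-running the header in a fresh session:

```shell
# Write a generated file that carries its own generating prompt as a header.
cat > row_count.py <<'EOF'
# GENERATED with an LLM from the prompt below. To regenerate, paste the
# prompt into a fresh session and replace this file with the output.
#
# PROMPT: Write a Python script that reads a CSV file named on the command
# line and prints the number of data rows.
import csv, sys

with open(sys.argv[1], newline="") as f:
    print(sum(1 for _ in csv.reader(f)) - 1)  # minus the header row
EOF
head -n 5 row_count.py   # show the preserved prompt header
```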

stackghost · yesterday at 6:49 AM

>I then iterate on that plan.md with the AI until it's what I want.

Which tools/interface are you using for this? Opencode/claude code? Gas town?
