Hacker News

If AI writes code, should the session be part of the commit?

460 points · by mandel_x · yesterday at 12:27 AM · 374 comments

Comments

jumploops · yesterday at 6:43 AM

I've been experimenting with a few ways to keep the "historical context" of the codebase relevant to future agent sessions.

First, I tried using simple inline comments, but the agents happily (and silently) removed them, even when prompted not to.

The next attempt was to have a parallel markdown file for every code file. This worked OK, but suffered from a few issues:

1. Understanding context beyond the current session

2. Tracking related files/invocations

3. Cold start problem on existing codebases

To solve 1 and 3, I built a simple "doc agent" that does a poor man's tree traversal of the codebase, noting any unknowns/TODOs, and running until "done."

To solve 2, I explored using the AST directly, but this made the human aspect of the codebase even less pronounced (not to mention a variety of complex edge-cases), and I found the "doc agent" approach good enough for outlining related files/uses.

To improve the "doc agent" cold start flow, I also added a folder level spec/markdown file, which in retrospect seems obvious.

The main benefit of this system is that when the agent is working, it not only has to change the source code, but it also has to reckon with the explanation/rationale behind that source code. I haven't done any rigorous testing, but in my anecdotal experience, the models make fewer mistakes and cause fewer regressions overall.

I'm currently toying around with a more formal way to mark something as a human decision vs. an agent decision (i.e. this is very important vs. this was just the path of least resistance); however, the current approach seems to work well enough.

If anyone is curious what this looks like, I ran the cold start on OpenAI's Codex repo[0].

[0]https://github.com/jumploops/codex/blob/file-specs/codex-rs/...
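For readers curious what the cold-start scaffolding could look like, here is a minimal sketch in Python. The layout (a parallel `.md` per source file plus a folder-level `SPEC.md`) follows the comment above, but the file names, extensions, and section headings are illustrative assumptions, not jumploops' actual tooling:

```python
"""Sketch of a cold-start pass: scaffold a parallel .md stub per source
file, plus a folder-level spec, for a doc agent to fill in later."""
from pathlib import Path

SOURCE_EXTS = {".rs", ".py", ".ts"}  # assumption: which files count as source

def scaffold_docs(root: Path) -> list[Path]:
    created = []
    # Materialize the listing first so newly created .md files aren't rescanned
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_EXTS:
            doc = path.with_suffix(path.suffix + ".md")  # a.py -> a.py.md
            if not doc.exists():
                doc.write_text(
                    f"# {path.name}\n\n"
                    "## Purpose\nTODO (agent: fill in)\n\n"
                    "## Related files\nTODO\n\n"
                    "## Open questions\nTODO\n"
                )
                created.append(doc)
    # Folder-level spec file to ease the cold-start flow
    for folder in {p.parent for p in created}:
        spec = folder / "SPEC.md"
        if not spec.exists():
            spec.write_text(f"# {folder.name}/\n\nTODO: folder-level overview\n")
            created.append(spec)
    return created
```

The doc agent would then traverse the tree and replace the TODOs, noting unknowns as it goes.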

natex84 · yesterday at 2:39 AM

If the model in use is managed by a 3rd party, can be updated at will, and also gives different output each time it is interacted with, what is the main benefit?

If I chat with an agent and give an initial prompt, and it gets "aspect A" (some arbitrary aspect of the expected code) wrong, I'll iterate to get "aspect A" corrected. Other aspects of the output may have exactly matched my (potentially unstated) expectation.

If I feed the initial prompt into the agent at some later date, should I expect exactly "aspect A" to be incorrect again? It seems more likely the result will be different, maybe with some other aspects being "unexpected". Maybe these new problems weren't even discussed in the initial archived chat log, since at that time those parts happened to be generated in alignment with the original engineer's expectations.

daxfohl · yesterday at 3:01 AM

I think so. If nothing else, when you deploy and see a bug, you can have a script that revives the LLM sessions of the last N commits and asks "would your change have caused this?" It probably wouldn't work, or be any more efficient than a new debugging agent, most of the time, but it might sometimes, and you'd have a fix PR ready before you even answered the pager, along with a postmortem that includes WHY it happened and a prompt to prevent that behavior in the future. And it's cheap, so why not.

Maybe not a permanent part of the commit, but something stored on the side for a few weeks at a time. Or even permanently: it could be useful to go back and ask, "why did you do it that way?", and realize that the reason is no longer relevant and you can simplify the design without worrying you're breaking something.

otar · yesterday at 4:45 AM

In an ideal world, a specification file would be committed to the repository and then linked to the PR/commit. But that slows you down, and then it's no longer vibe coding?

Soon only the specifications will matter. Code can be generated from those specifications again and again.

crossroadsguy · yesterday at 4:41 AM

Goodness no! Sometimes I literally SHOUT at these agents/chats and often stoop down to using cuss words, which I am not proud of, but surprisingly it has shown to work here and there. As real as that is, I'd not want that on record in a commit.

root_axis · yesterday at 4:11 AM

This seems wrong, like committing debug logs to the repo. There's also lots of research showing that models regularly produce incorrect trace tokens even with a correct solution, so there's questionable value even from a debugging perspective.

burntoutgray · yesterday at 2:08 AM

YES! The session becomes the source code.

Back in the dark ages, you'd run "cc -S hello.c" to check the assembler source. With time we stopped doing that and hello.c became the originating artefact. On the same basis the session becomes the originating artefact.

robseed · yesterday at 5:54 PM

Unedited AI generated code should have a different blame line than regular code, something like author_ai vs author.

semiinfinitely · yesterday at 3:43 PM

Should your browser and search history be part of the commit too?

fladrif · yesterday at 5:09 AM

I think this is a lot of "kicking the can down the road" on not understanding what code the AI is writing. Once you give up understanding the code that is written, there is no going back. You can add all the helper commit messages, architecture designs, and plans, but then you introduce the problem of having to read all of those once you run into an issue. We've left readability by the wayside at the altar of "writeability".

The paradigm shift, which is really a shift back, is to embrace the fact that you have to slow down and understand all the code the AI is writing.

bloomca · yesterday at 7:16 AM

I don't think it's worth including the session -- it would bloat the context too much anyway.

However, I do think that a higher-level description of every notable feature should be documented, along with the general implementation details. I use this approach for my side projects and it works fairly well.

The biggest question is whether it will scale. I suspect not, and I also suspect it is probably better to include nothing than poor/disjointed/sporadic documentation of the sessions.

eddyg · yesterday at 1:21 PM

https://specstory.com/specstory-cli is another tool in this space (it writes clean Markdown session files into the project for future reference)

visarga · yesterday at 5:19 AM

Yes, it should remain part of the commit, and the work plan too, including judgements/reviews done with other agents. The chat log encodes user intent in raw form, which justifies the tasks, which in turn justify the code and its tests. Bottom up, we say the tests satisfy the code, which satisfies the plan and finally the user intent. You can play the "satisfied/justified" game across the whole stack.

I only log my own user messages, not AI responses, in a chat_log.md file, which is created by a user-message hook in the repo.
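A minimal sketch of such a hook, assuming the agent tool delivers the hook payload as JSON on stdin with a "prompt" field (as Claude Code's UserPromptSubmit hook reportedly does); the field name and log format here are assumptions, not the commenter's actual setup:

```python
#!/usr/bin/env python3
"""Sketch of a prompt-logging hook: read the hook payload from stdin
and append only the user's message to chat_log.md."""
import json
import sys
from datetime import datetime, timezone

def log_prompt(payload: str, log_path: str = "chat_log.md") -> str:
    # Pull the user's message out of the JSON payload (field name assumed)
    prompt = json.loads(payload).get("prompt", "").strip()
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    entry = f"\n## {stamp} UTC\n\n{prompt}\n"
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(entry)
    return entry

if __name__ == "__main__":
    log_prompt(sys.stdin.read())
```

Registered as a user-prompt hook, this yields a commit-friendly markdown log of intent without any of the AI's responses.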

willbeddow · yesterday at 6:04 AM

Increasingly, I'd like the code to live alongside a journal and research log. My workflow right now is spending most of my time in Obsidian writing design docs for features, and then manually managing Claude sessions, pasting the docs back and forth. I have a page in Obsidian for each ongoing session, where I record my prompts, forked paths, thoughts on future directions, etc. It seems natural that at some point all of this (code, journal, LLM context) will be unified.

jollymonATX · yesterday at 2:26 PM

How verbose a history is it even plausible to store and recall in modern git? This could put decent pressure on those mechanisms, and the result would be taxing for humans, at least, to consume.

rhgraysonii · yesterday at 6:15 AM

I think the decisions it made along the way are worth tracking. And it's got some useful side effects with regard to actually going through the programming and architecture process. I made a tool that really helps with this and finds a pretty portable middle ground; it can be used by one person or a team, it's flexible. https://deciduous.dev/

reflectt · yesterday at 5:43 AM

The session capture problem is harder than it looks because you need to capture intent, not steps.

A coding session has a lot of 'left turn, dead end, backtrack' noise that buries the decision that actually mattered. Committing the full session is like committing compiler output — technically complete, practically unreadable.

We've been experimenting with structured post-task reflections instead: after completing significant work, capture what you tried, what failed, what you'd do differently, and the actual decision reasoning. A few hundred tokens instead of tens of thousands. Commits with a reflection pointer rather than an embedded session.

The result is more useful than raw logs. Future engineers (or future AI sessions) can understand intent without replaying the whole conversation. It's closer to how good commit messages work — not 'here's what changed' but 'here's why'.

Dang's point about there being no single session is also real. Our biggest tasks span multiple sessions and multiple contributors. 'Capture the session' doesn't compose. 'Capture the decision' does.
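A sketch of what a "reflection pointer" could look like: a few-hundred-token structured record written beside the repo, referenced from the commit message via a git-style trailer. The schema (tried/failed/decision) and the `Reflection:` trailer name are invented here for illustration, not the commenter's actual format:

```python
"""Sketch of a post-task reflection record plus a commit-message trailer
pointing at it."""
import hashlib
import json
from pathlib import Path

def write_reflection(tried: str, failed: str, decision: str,
                     out_dir: Path = Path(".reflections")) -> Path:
    # Content-addressed filename so identical reflections dedupe naturally
    record = {"tried": tried, "failed": failed, "decision": decision}
    body = json.dumps(record, indent=2, sort_keys=True)
    name = hashlib.sha1(body.encode()).hexdigest()[:12] + ".json"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / name
    path.write_text(body)
    return path

def commit_message(summary: str, reflection: Path) -> str:
    # Git-style trailer in its own paragraph so tooling can parse it
    return f"{summary}\n\nReflection: {reflection.as_posix()}\n"
```

Future engineers (or agents) can follow the trailer to the decision record without replaying any conversation.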

PeterStuer · yesterday at 10:19 AM

The session might contain many artifacts that are not suited for open sourcing. The additional fine-grained curation effort required might be more of an obstacle to open sourcing than the perceived benefits justify.

That said, preserved private session records might be of great personal benefit.

saratogacx · yesterday at 4:15 AM

I've gotten into the habit of having the LLM produce a description of its process and summarize the change. Then I add that, along with the model I used, after my own commit message. It lets me know where I used AI and what it did, as well as what I thought it did.

The entire prompt and process would be fine if my git history were subject to research, but really it is a tool for me or anyone else who wants to know what happened at a given time.

voidUpdate · yesterday at 11:36 AM

People keep talking about how LLMs are like a compiler from human language to code. We commit source code instead of just compiled machine code, so why should this be any different? The "source code" is the prompts.

zkmon · yesterday at 9:32 AM

Source code repositories such as git are for "sources", which are direct outputs of human effort. Any generated stuff is not "source". It is the same as the output of compile and build activities. Only the direct outputs of human effort should go into git.

kkarpkkarp · yesterday at 6:12 AM

For my own projects in private repos, I would benefit from exporting the session. For example, if I need to return to the task, it would be great to give it as context.

For my work as one developer on a team, no. The way I prompt is my asset and my advantage over others on the team who always complain about AI not being able to provide correct solutions, and it secures my career.

ajam1507 · yesterday at 1:08 PM

Yes, please. It would solve the problem of the relentless HN discussions about how useful AI is for coding. We could actually see how productive people are when using it.

jillesvangurp · yesterday at 8:10 AM

I think that's covered by the YAGNI rule. It has very little value that rapidly drops off as you commit more code. Maybe some types of software you might want to store some stuff for compliance/auditing reasons. But beyond that, I don't see what you would use it for.

phyzix5761 · yesterday at 6:14 AM

Have the AI explain the reasoning behind the PR. I don't think people really care about your step-by-step process, but reviewers might care about your approach, design choices, caveats, and trade-offs.

That context could clarify the problem, why the solution was chosen, key assumptions, potential risks, and future work.

akoskomuves · yesterday at 11:18 AM

I've done something similar with full analytics and options to add the full team. https://getpromptly.xyz

ryan_velazquez · yesterday at 10:45 AM

If the agent is like a compiler, show me the source code.

I'm not sure about it becoming part of the repo/project long term, but I think providing your prompts as part of the pull request makes the review much easier, because the reviewer can quickly understand your _intent_. If your intent has faulty assumptions, or if the reviewer disagrees with the intent, that should be addressed first. If the intent looks good, a reviewer can then determine whether you (or your coding agent) have actually implemented it.

segmondyyesterday at 6:21 AM

It's already bad enough that people are saying there's too much code to read and review. You want to add the session to it? Running it again might not yield the same output. These models are non-deterministic, and models are often changed and upgraded.

galaxyLogic · yesterday at 9:31 AM

Couldn't AI write the commit message based on the prompt history up to the commit, thus making it easier for any future reviewers to understand what led to, and what is in, a specific commit?

rcy · yesterday at 2:56 AM

I haven't adopted this yet, but I have a feeling that something like this is the right level at which to record the LLM contribution/session: https://blog.bryanl.dev/posts/change-intent-records

Marlinski · yesterday at 10:45 AM

If there were a standardized way to save this information and tie it to each commit, it would be insanely useful for amassing a very valuable training dataset.

mixdup · yesterday at 3:56 PM

LLMs are non-deterministic, so feeding that session back in may well get you different output. Also, models change over time, so you may not necessarily be able to run the session against the same model again.

The whole point of the source code it generates is to have the artifact. Maybe this is somewhat useful if you need to train people how to use AI, but at the end of the day the generated code is the thing that matters. If you keep other notes/documentation from meetings and design sessions, wherever you keep those is probably where this should go, too?

travisgriggs · yesterday at 2:40 AM

In our (small) team, we've taken to documenting/disclosing what part(s) of the process an LLM tool played in the proposed changes. We've all agreed that we like this better, both as submitters and reviewers. And though we've discussed why, none of us has pinned down exactly WHY we like this model better.

heavyset_go · yesterday at 5:46 AM

If you need LLM sessions included to understand or explain commits, you're doing something wrong.

Saving sessions is even more pointless without the full context the LLM uses that is hidden from the user. That's too noisy.

genghisjahn · yesterday at 3:55 AM

If you can, run several agents. They document their process: trade-offs considered, reasoning, etc. It's not a full log of the session, but a reasonable history of how the code came to be. Commit it with the code. Namespace it however you want.

wiseowise · yesterday at 8:44 AM

No, because if AI is set to replace a human, their prompting skill and approach are the only things differentiating them from the rest of the grey mass.

tezza · yesterday at 7:25 AM

I put a link to the LLM session at the end of the commit, and prefix with POH: if I wrote it by hand.

POH = Plain Old Human

Easy to achieve.

Why NOT include a link back? Why deprive yourself of information?
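That convention is easy to script. A minimal helper, assuming the session URL is available at commit time (the `Session:` trailer name is an illustrative choice; `POH:` is the commenter's own prefix):

```python
from typing import Optional

def annotate_commit(message: str, session_url: Optional[str] = None) -> str:
    """Append a link to the LLM session, or mark hand-written work
    with a POH: (Plain Old Human) prefix."""
    if session_url:
        # Trailer in its own paragraph, git-style
        return f"{message}\n\nSession: {session_url}\n"
    return f"POH: {message}"
```

Either way, a reader of `git log` can tell at a glance how each change came to be.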

atmosx · yesterday at 10:13 AM

It is a useful piece of information, but the session is not “long lived” in terms of git commit history lifetime.

toddmorrow · yesterday at 11:08 PM

Yep, but I don't know what folder. Maybe under logs. It's really a new category.

DonThomasitos · yesterday at 6:38 AM

Everything in git can and must be merge-able when merging branches. After all, git is a collaboration tool, not an undo-redo stack.

ChicagoDave · yesterday at 8:05 AM

The last 5 sessions. Beyond that I archive them outside the repo. But I do save them for review and summaries.
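A sketch of that retention policy, assuming sessions land as `.jsonl` files in a directory inside the repo (the paths, naming, and archive location are assumptions, not ChicagoDave's actual setup):

```python
"""Sketch of a "last 5 sessions in the repo, older ones archived
outside it" retention policy."""
import shutil
from pathlib import Path

def rotate_sessions(sessions_dir: Path, archive_dir: Path,
                    keep: int = 5) -> list[Path]:
    """Move all but the newest `keep` session files out of the repo."""
    files = sorted(sessions_dir.glob("*.jsonl"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for old in files[keep:]:  # everything past the newest `keep`
        target = archive_dir / old.name
        shutil.move(str(old), str(target))
        moved.append(target)
    return moved
```

Run after each session (or from a pre-commit hook), this keeps the working set reviewable while preserving the archive for later summaries.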

danhergir · yesterday at 12:49 AM

One of the use cases I see for this tool is helping companies understand the output coming from the LLM black box and the process the employee took to complete a certain task.

dolebirchwood · yesterday at 8:02 AM

I drop a lot of F-bombs and other unpleasantries when I talk to the robots, so I'd rather not.

darepublic · yesterday at 5:18 AM

If a human writes code, should the Jira ticket be part of the commit? I am actually thinking about the potential merits.

stopthe · yesterday at 10:39 AM

No. Even further than that: by maintaining AGENTS.md and the like in your company repo, you basically train your own replacement. That replacement will not be as capable as you in the long run, but few businesses will care. Anyway, having some representation of an employee's thinking definitely lowers the cost of firing that employee.

That is a cynical take, and not very different from advice to never write any documentation or never help your teammates. But that resemblance is superficial. In any organization, you shouldn't help people who steal your time for their benefit (Sean Goedecke calls them predators: https://www.seangoedecke.com/predators/).

On the other hand, it may be beneficial to privately save CLAUDE.md and other parts of the persistent context. You may gitignore them (but that will be conspicuous unless you also gitignore .gitignore) or just load them from ~/.claude.

I expect an enterprise version of Claude Code that will save any human input to the org servers for later use.

ekjhgkejhgk · yesterday at 8:48 AM

If a person writes code, should all the process be part of the commit?

grahar64 · yesterday at 7:00 AM

If AI could reliably write good code, then you shouldn't even need to commit the code, as the general rule is that you shouldn't commit generated code. Commit the session when you don't need to commit the code.

stubbi · yesterday at 4:15 AM

Isn’t that what entire.io, founded by the former GitHub CEO, is doing?

jiveturkey · yesterday at 2:38 AM

https://entire.io thinks so

hirako2000 · yesterday at 5:29 AM

What's the value, given the answers are not deterministic?
