Hacker News

rDr4g0n · yesterday at 1:52 PM

When I began reviewing my teammate’s PRs with AI-generated code in them, something started to feel weird. It took a bit, but I realized the problem: I am not reviewing the work my teammate did.

What are they even supposed to do with feedback on the code? My teammate has to translate it into the language of the work they actually did, which is the conversation they had with the AI agent.

But the conversation isn't the "real work": the decisions made in the conversation are the real work. That is what needs to be captured and reviewed.

So now that I know why code reviews feel kinda wrong, what can we do to have meaningful reviews of the work my teammates have done?

What I landed on is aiming to capture more and more of the “work” in the form of a spec, review the spec, and ignore the code. This isn't novel or interesting. HOWEVER...

For the large, messy, legacy codebases I work in today, I don’t like the giant spec-driven development approach that is most popular today. It’s too risky to trust the spec alone because it touches so much messy code with so many gotchas. However, with the rate of AI-generated code rolling in, I simply can’t switch context quickly enough to review it all efficiently. Also, it’s exhausting.

The approach I have been refining is defining very small modules (think a class or meaningful collection of utils) with a spec and a concise set of unit tests, generating code from the spec, then not reading or editing the generated code.

Any changes to the code must be made to the spec, and the code re-generated. This puts the PR conversation in the right place, against the work I have actually done: writing the spec.

So far the approach has worked for replacing simple code (e.g. a NestJS service that has a handful of public methods, a bit of business logic, and a few API client calls). PRs usually have a handful of lines of glue code to review, but the rest is specs (and a selection of “trust” unit tests), and the idea is that the generated code can be skipped.

AI review bots still review the PR and comment on code quality and potential security concerns, which I then translate into updates to the spec.

I find this to be a good step towards the codegen future without totally handing over my (very messy and not very agent friendly) codebases.