Hacker News

gamegoblin, yesterday at 6:50 PM (3 replies)

I routinely leave Codex running for a few hours overnight to debug stuff.

Say you have a deterministic unit test that can reproduce the bug through your app's front door, but you have no idea how the bug is actually happening. In that situation, a coding agent can just grind through the slog of sticking debug prints everywhere, testing hypotheses, etc. It's an ideal use case.
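
The precondition here is that the reproduction really is deterministic. A minimal sketch of one way to check that before handing the test to an agent, assuming the reproduction is a command whose non-zero exit code means "bug reproduced". The function name `is_deterministic_failure` and the stand-in commands are hypothetical, not anything from the comment above:

```python
import subprocess
import sys


def is_deterministic_failure(cmd, runs=5):
    """Return True only if `cmd` exits non-zero on every one of `runs` attempts."""
    codes = [subprocess.run(cmd, capture_output=True).returncode for _ in range(runs)]
    return all(code != 0 for code in codes)


# Stand-in commands so the sketch runs anywhere; swap in your real test command.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
always_passes = [sys.executable, "-c", "pass"]

print(is_deterministic_failure(always_fails, runs=3))
print(is_deterministic_failure(always_passes, runs=3))
```

If the check comes back False for a test you believe is failing, the failure is probably flaky, and an unattended run may end up chasing noise.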


Replies

nikkwong, yesterday at 7:19 PM

I have a hard time understanding how that would work. I typically interface with coding agents through Cursor. The flow is like this: ask it something -> it works for a minute or two -> I have to verify and fix by asking it again; and so on, until we're at a happy place with the code. How do you keep it from going down a bad path and never pulling itself out?

The important role for me, as a SWE, in the process is to verify that the code does what we actually want it to do. If you remove yourself from the process by letting it run on its own overnight, how does it know it's doing what you actually want it to do?

Or is it more that, with your use case, you can say "here's a failing test; do whatever you can to fix it and don't stop until you do"? I could see that limited case working.
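
The "failing test, don't stop until it passes" case can be sketched as a loop where the test's exit code is the only definition of done. This is a rough illustration under stated assumptions, not how Codex or Cursor work internally: `grind_until_green`, `agent_step`, and `max_steps` are hypothetical names, and `agent_step` stands in for whatever actually invokes the agent with the latest test output.

```python
import subprocess
import sys
from typing import Callable, Sequence


def grind_until_green(
    test_cmd: Sequence[str],
    agent_step: Callable[[str], None],
    max_steps: int = 20,
) -> bool:
    """Alternate between running the test and letting the agent act.

    Returns True as soon as the test passes, or False if it still fails
    after `max_steps` agent turns (the budget that stops a bad path from
    running forever).
    """
    for _ in range(max_steps):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        # Hypothetical hook: hand the failure log to the agent for one turn.
        agent_step(result.stdout + result.stderr)
    # One last check after the final agent turn.
    return subprocess.run(test_cmd, capture_output=True).returncode == 0


# Usage with a do-nothing agent and a test that can never pass.
failing_test = [sys.executable, "-c", "raise SystemExit(1)"]
print(grind_until_green(failing_test, lambda log: None, max_steps=2))
```

The loop only knows the test went green, so it does not answer the verification question raised above: a human still has to check that the change fixes the bug rather than, say, weakening the test.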

tsss, yesterday at 7:01 PM

How can you afford that?

addaon, yesterday at 7:23 PM

> it's an ideal use case

This is impressive, you’ve completely mitigated the risk of learning or understanding.
