Hacker News

manmal yesterday at 7:47 PM

I don’t think LLMs are that great at creating, however much they have improved; I need to stay in the driver’s seat and really understand what’s happening. There’s not that much leverage in eliminating typing.

However, for reviewing, I want the most intelligent model I can get. I want it to really think the shit out of my changes.

I’ve just spent two weeks debugging what turned out to be a bad SQLite query plan (I was missing a reliable repro). Not one of the many agents, nor GPT-Pro, thought to check this. I guess SQL query planner issues are a hole in their review training data. Maybe Mythos will check such things.
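For what it’s worth, this is exactly the kind of check that’s cheap to do by hand: SQLite’s `EXPLAIN QUERY PLAN` shows whether a query scans the whole table or uses an index. A minimal sketch with a hypothetical `orders` table (not the poster’s actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

query = "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the planner falls back to a full table scan.
plan = conn.execute(query, (42,)).fetchall()
print(plan[0][3])  # the detail column reports a scan of orders

# After adding the index, the planner switches to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan = conn.execute(query, (42,)).fetchall()
print(plan[0][3])  # the detail column reports a search using the index
```

The exact wording of the detail column varies between SQLite versions (e.g. `SCAN orders` vs. `SCAN TABLE orders`), but scan-vs-search is the signal to look for.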


Replies

TheFirstNubian yesterday at 8:38 PM

I’m a little conflicted on this, as I see a slippery slope here. LLMs in their current state (e.g., Opus-4.7) are really good at planning and one-shot codegen, which I believe is their primary use case. So they do provide enough leverage in that regard.

With this new workflow, however, we should uncompromisingly steer the entire code review process. The danger here, the “slippery slope,” is that we’re constantly craving more intelligent models so we can somehow outsource the review to them as well. We may be subconsciously engineering ourselves into obsolescence.
