Hacker News

Majromax, today at 4:54 PM

> They are all just variations of "insert a canned prompt", varying only along the dimensions of (a) how and where the prompt is installed and from where it is sourced, and (b) which context or contexts the prompt runs in. There's not much advice here about which option is best, and no clear best practices seem to have emerged yet either. Personally, I find just asking Claude to review the code works well enough.

The subagent approach is structurally different from the others because it runs with clean context. That has three major effects:

1. All other things being equal, it will result in a lower cost-to-solution because of the quadratic cost scaling of an LLM session (input token or cached-input cost being paid with each new round).

2. The review model will not be able to 'cheat' by retaining assumptions from the main session, such as "x must be done like y." For people, this is why having a separate person perform code review (or, if that is not possible, reviewing code after a mind-clearing break) is handy; the analogy to LLMs is loose but reasonable.

3. The main model will only see the results of the review, not the detailed reasoning that leads up to it. On one hand this avoids more context pollution, but on the other hand it might lead to duplicative logic to re-discover the mechanics behind bugs found.
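To make the cost argument in point 1 concrete, here is a rough back-of-the-envelope sketch. It assumes every turn re-sends the whole accumulated context as input, and it ignores output tokens and the cached-input discount (which lowers the per-token price but not the shape of the curve). All the numbers (turn sizes, turn counts, diff size, summary size) are made up for illustration, and `session_input_tokens` is just a helper name I picked.

```python
def session_input_tokens(turn_sizes, base_context=0):
    """Total input tokens paid over a session where each turn
    re-sends everything accumulated so far."""
    total = 0
    context = base_context
    for size in turn_sizes:
        context += size   # this turn's new tokens join the context
        total += context  # the full context is paid for again as input
    return total


# Hypothetical workload: 20 turns of main work, then a 5-turn review.
MAIN_TURNS = [2_000] * 20
REVIEW_TURNS = [2_000] * 5
DIFF_TOKENS = 4_000     # assumed size of the diff handed to the subagent
SUMMARY_TOKENS = 500    # assumed size of the review summary returned

# Option A: review inline, on top of the full main-session context.
inline = session_input_tokens(MAIN_TURNS + REVIEW_TURNS)

# Option B: review in a subagent that starts from only the diff,
# then the main session takes one extra turn to read the summary.
main_with_summary = session_input_tokens(MAIN_TURNS + [SUMMARY_TOKENS])
subagent = session_input_tokens(REVIEW_TURNS, base_context=DIFF_TOKENS)
with_subagent = main_with_summary + subagent

print(f"inline review:   {inline:,} input tokens")
print(f"subagent review: {with_subagent:,} input tokens")
```

With these made-up numbers the inline review comes to 650,000 input tokens and the subagent version to 510,500. The gap grows with the length of the main session, since the inline review turns each pay for the whole prior context.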

> I checked the session logs to see how often the agents were actually invoking the LSP tools. The answer was they had invoked them literally once the entire time.

I think the intent behind 'install a language server plugin' is that these tools should lint automatically after every edit, without waiting for an explicit call from the LLM.


Replies

mil22, today at 6:04 PM

> The subagent approach is structurally different from the others because it runs with clean context.

Yes, and this is what I mean by "which context the prompt runs in". The subagent approach is different and has pros and cons, and it may in some situations be better (but perhaps not in others). On the other hand, I can also just create a new conversation and paste my own review prompt into it; then take the last turn's summary output and feed it back into my main conversation thread in the unusual event I would need to do so. Spawning a subagent is a convenient shortcut for this, but ultimately, it's the same thing.

> I think the intent behind 'install a language server plugin' is that these tools should lint automatically after every edit, without waiting for an explicit call from the LLM.

This is a great point and I had only checked my session logs for explicit tool calls. I went back and looked for diagnostics injected automatically by the harness after every edit, and whether the agent made use of them.

Claude: neither the Rust nor the Dart LSP ever inserted any diagnostic events, but Ty did. Across 627 sessions, ty-lsp injected diagnostics blocks in 186 sessions, with a total of 33 findings. Of those 33, 32 were dismissed as unrelated (13) or pre-existing (19). Only 1 finding was acted upon. The model is in the habit of running the batch analysis tools (ruff, ty, cargo clippy, etc.) and prek anyway, so it would have caught that diagnostic regardless.

Codex: no diagnostic events were inserted by any of the LSPs.

So I won't be reinstalling those LSPs.