fiso64 • yesterday at 12:03 PM

Yes, it is relevant and testable. It's exactly what I meant by "a measurable increase in quality of the final product". In fact, a proper test harness would reveal that problem. You are forgetting that with LLMs, testing software does not have to end at the usual unit/integration/e2e level.

Replies

ricardobeat • yesterday at 1:28 PM

But how is that testable? If your tests validate rigidity, water resistance, etc., they will all pass even if the underlying material is a bad choice, or the glue will degrade in six months.

You can't test whether a codebase will be extensible or maintainable as requirements change in the future, or whether the abstraction level or architecture is sound - that's down to code quality measures like the ones used here. LLMs are very good at slightly cheating to pass tests even when the implementation is wrong. Introducing subjectivity - the kind of input a human will provide - leads to improved output.

https://senior-swe-bench.snorkel.ai/blog/2026-06-16-how-it-w...
