Hacker News

devin yesterday at 4:12 PM (17 replies)

> If you can go from producing 200 lines of code a day to 2,000 lines of code a day, what else breaks? The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn’t.

It is so embarrassing that LOC is being used as a metric for engineering output.


Replies

ilikebits yesterday at 4:17 PM

LOC is useful here not because it's a metric for output but because it's a metric for _understandability_. Reviewing 200 lines is a very different workload than reviewing 2000.

keeda yesterday at 6:21 PM

LoC is perfectly fine as a metric for engineering output. It is terrible as a standalone measure of engineering productivity, and the problems occur when one tries to use it as such.

It's still useful, however, because that is the only metric that is instantly intuitively understandable and comparable across a wide variety of contexts, i.e. across companies and teams and languages and applications.

As we know, within the same team working on the same product, a 1000 LoC diff could take less time than a 1 line bug fix that took days to debug. Hence we really cannot compare PRs or product features or story points across contexts. If the industry could come up with a standard measure of developer productivity, you can bet everyone would use it, but it's infeasible for basically this very reason.

So, when such comparisons are made (and in this case it was clearly a colloquial usage), it helps to assume the context remains the same. Like, a team A working on product P at company C using tech stack T with specific software quality processes Q produced N1 lines of code yesterday, but today with AI they're producing N2 lines of code. Over time the delta between N1 and N2 approximates the actual impact.

(As an aside, this is also what most of the rigorous studies in AI-assisted developer productivity have done: measure PRs across the same cohorts over time with and without AI, like an A/B test.)

faizshah yesterday at 4:33 PM

I experimented with vibe coding (not looking at the code myself) and it produced around 10k LOC even after refactors etc.

I rewrote the same program using my own brain, with ChatGPT only as Google and autocomplete (my normal workflow), and produced the same thing in 1500 LOC.

The effort difference was not that significant either tbh, although my hand-coded approach probably benefited from designing the vibe-coded one, so I had already thought of what I wanted to build.

jwpapi yesterday at 11:53 PM

I deleted 75000 lines of code from my codebase in the last 2 months, and that was tremendously more useful to my business than the 75000 lines AI had written in the 2 months before...

mcmcmc yesterday at 4:18 PM

Is it? The whole point of the article is that the rate of output for writing code has surpassed the rate at which it can be reviewed by humans. LOC as an input for software review makes a lot of sense, since you literally need to read each line.

adtac yesterday at 4:18 PM

LOC is the worst metric for engineering output, except for all the others - Churchill

root_axis yesterday at 4:31 PM

He's not using LOC as a metric, he's making an observation about the impact of a change in the typical volume of LOC.

np1810 today at 3:15 AM

I just read somewhere on HN that "code is a liability, not an asset, the idea behind the code/final product is the actual asset." And, I can't agree more...

> It is so embarrassing that LOC is being used as a metric for engineering output.

At one of my previous orgs, LOC added in the previous year was a metric used to distinguish a good engineer from a PIP (bad) engineer. LOC removed was also treated as a negative metric for the same purpose. I hope they've changed this methodology for the LLM code-spitting era...

etothet yesterday at 4:16 PM

Agreed. And LOC has historically been one of the things we've collectively fought management over as a way to evaluate a "productive" developer!

autoconfig yesterday at 11:42 PM

The charitable interpretation here is obviously that the LoCs are equivalent in quality, in which case it is a very useful metric in the context that was presented. The inability to infer that should be embarrassing.

sva_ yesterday at 10:07 PM

I wonder if '2000 LOC' was chosen to refer to this old anecdote from the 80s:

https://www.folklore.org/Negative_2000_Lines_Of_Code.html

vrganj yesterday at 4:29 PM

I read somewhere that measuring software engineering output by LoC is like measuring aerospace engineering by pounds added to the plane and I thought that was an apt comparison.

hungryhobbit yesterday at 4:39 PM

Humans are also incredibly varied and different.

Do you reject all stats that treat the number of people involved (e.g. 2 million people protested X) as "embarrassing" ... because they lump incredibly varied people together and pretend they're equal?

moomoo11 yesterday at 10:51 PM

I follow Garry Tan on X and he’s a big proponent of LOCmaxxing using AI.

AI helps eng ship more and faster, I think that’s the takeaway.

dyauspitr yesterday at 6:14 PM

Honestly it’s more like 200 to 100,000 lines of pretty decent quality code at this point.

kashyapc yesterday at 4:23 PM

Totally. I thought Simon was wiser than this; even he couldn't resist getting swept up by breathless hype. The moment you start typing "LOC as a metric", alarm bells should go off in your head.

estimator7292 yesterday at 4:16 PM

At least "mentions of LOC" is now a great metric for "how clueless is this person"