Hacker News

ryanackley, yesterday at 2:28 PM

> The thing with coding agents is that it seems you can now eat your cake and have it too. We are all still adapting, but results indicate that, given the right prompts and processes harnessing LLMs, quality code can be had on the cheap.

It's cheaper, but not cheap.

If you're building a variation of a CRUD web app, or aggregating data from some data source(s) into a chart or table, you're right. It's like magic. I never thought this type of work was particularly hard or expensive though.

I'm using frontier models, and I've found that if you're working on something that hasn't already been done by 100,000 developers before you and published to Stack Overflow and/or open source, the LLM becomes a helpful tool but requires a ton of guidance. Even the tests LLMs write seem biased toward passing rather than stressing the code and finding bugs.
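On the "tests biased to pass" point, a minimal illustration (the parser and both tests are invented for this comment, not taken from any real project) of the difference between a test that mirrors the happy path and one that stresses the code:

```python
# Hypothetical function under test: naive "k=v,k2=v2" parser.
def parse_pairs(s):
    return dict(item.split("=") for item in s.split(",") if item)

# The kind of test an LLM tends to emit: it restates the happy path,
# so it passes by construction and exercises nothing surprising.
assert parse_pairs("a=1,b=2") == {"a": "1", "b": "2"}

# Stress-style checks: awkward inputs the naive code may mishandle.
assert parse_pairs("") == {}                            # empty input
assert parse_pairs("a=1,,b=2") == {"a": "1", "b": "2"}  # empty segment
try:
    # A value containing '=' makes split("=") yield 3 parts,
    # which dict() rejects -- an actual bug the happy-path test missed.
    parse_pairs("a=1=2")
    survived = True
except ValueError:
    survived = False
assert not survived
```

The first assert is the sort of test that looks like coverage but only confirms what the code already does; the later ones are the ones that actually surface defects.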


Replies

zahlman, yesterday at 6:02 PM

> I never thought this type of work was particularly hard or expensive though.

Maybe not intrinsically hard, but hard because it's so boring you can't concentrate.

> the LLM becomes a helpful tool but requires a ton of guidance. Even the tests LLMs write seem biased toward passing rather than stressing the code and finding bugs.

ISTR some people have had success by taking responsibility for the tests themselves and only having the LLM work on the main code. But since I only vaguely recall it, that was probably a while ago, so who knows if it still holds.
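That half-remembered workflow is essentially test-driven development with the roles split: the human writes the spec as tests first, and the agent iterates on the implementation until they pass. A minimal sketch, with all names invented for illustration:

```python
import re

# Human-authored spec: written before any implementation exists,
# so it cannot be biased toward what the code happens to do.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaced  out  ") == "spaced-out"
    assert slugify("") == ""

# LLM-authored implementation, iterated until the spec above passes.
def slugify(text):
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

test_slugify()
```

The point of the split is that the tests define "done" independently of the generated code, so the agent can't quietly weaken them to make itself pass.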

gchamonlive, yesterday at 4:41 PM

> It's cheaper but not cheap

It's quite cheap if you consider developer time. But it's only as cheap as how effectively you can drive the model; otherwise you are just wasting tokens on garbage code.

> LLM becomes a helpful tool but requires a ton of guidance

I think this is always going to be the case. You are driving the agent like you drive a bike, it'll get you there but you need to be mindful of the clueless kid crossing your path.

For some projects I had good results just letting the agent loose. For others I'd have to make the tasks more specific and granular before offloading to the LLM. I see nothing wrong with it.