But if you try some penny-saving cheap model like Sonnet [..bad things..]. [Better] pay through the nose for Opus.
After blowing $800 of my bootstrap startup funds on Cursor with Opus for myself in a very productive January, I figured I had to try to change things up... so this month I'm jumping between Claude Code and Cursor, sometimes writing the plans and having the conversation in Cursor and dumping the implementation plan into Claude. Opus in Cursor is just so much more responsive and easy to talk to, compared to Opus in Claude.
Cursor has this "Auto" mode which feels like it has very liberal limits (amortized cost I guess) that I'm also trying to use more, but -- I don't really like to flip a coin and if it lands up head then waste half hour discovering the LLM made a mess the LLM and try again forcing the model.
Perhaps in March I'll bite the bullet and take this author's advice.
The author is correct in that agents are becoming more and more capable and that you don't need the IDE to the same extent, but I don't see that as good. I find that IDE-based agentic programming actually encourages you to read and understand your codebase as opposed to CLI-based workflows. It's so much easier to flip through files, review the changes it made, or highlight a specific function and give it to the agent, as opposed to through the CLI where you usually just give it an entire file by typing the name, and often you just pray that it manages to find the context by itself. My prompts in Cursor are generally a lot more specific and I get more surgical results than with Claude Code in the terminal purely because of the convenience of the UX.
But secondly, there's an entire field of LLM-assisted coding that's being almost entirely neglected and that's code autocomplete models. Fundamentally they're the same technology as agents and should be doing the same thing: indexing your code in the background, filtering the context, etc, but there's much less attention and it does feel like the models are stagnating.
I find that very unfortunate. Compare the two workflows:
With a normal coding agent, you write your prompt, then you have to wait at least a full minute for the result (generally more, depending on the task), breaking your flow and forcing you to task-switch. Then it gives you a giant mass of code, and of course 99% of the time you just approve and test it because it's a slog to read through what it did. If it doesn't work as intended, you get angry at the model and retry your prompt, spending more tokens the longer your chat history grows.
But with LLM-powered autocomplete, when you want, say, a function to do X, you write the comment describing it first, just like you should if you were writing it yourself. You instantly see a small section of code, and if it's not what you want, you can alter your comment. Even if it's not 100% correct, multi-line autocomplete is great because you approve it line by line and can stop when it gets to the incorrect parts. You're not forced to task-switch and you don't lose your concentration, that great sense of "flow".
Fundamentally it's not that different from agentic coding - except instead of prompting in a chatbox, you write comments in the files directly. But I much prefer the quick feedback loop, the ability to ignore outputs you don't want, and the fact that I don't feel like I'm losing track of what my code is doing.
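To make the comment-first workflow concrete, here's a hypothetical illustration (the comment, function name, and body are all invented, not taken from any particular autocomplete product): you type the comment and the signature, and the model proposes the body, which you accept a line at a time.

    from datetime import datetime, timezone

    # Parse an ISO-8601 timestamp and return seconds since the Unix epoch,
    # raising ValueError on malformed input.
    def iso_to_epoch(ts: str) -> float:
        dt = datetime.fromisoformat(ts)           # suggested line 1: accept
        if dt.tzinfo is None:                     # suggested line 2: accept
            dt = dt.replace(tzinfo=timezone.utc)  # suggested line 3: accept
        return dt.timestamp()                     # stop here if this isn't what you meant

If the suggestion drifts from what you meant, you stop accepting, tweak the comment, and let it try again. No chat window, no task switch.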
I don't trust the idea of "not getting", "not understanding", or "being out of touch" with anti-LLM (or pro-LLM) sentiment. There is nothing complicated about this divide. The pros and cons are both as plain as anything has ever been. You can disagree - even strongly - with either side. You can't "not understand".
> Along the way I have developed a programming philosophy I now apply to everything: the best software for an agent is whatever is best for a programmer.
Not a plug but really that’s exactly why we’re building sandboxes for agents with local laptop quality. Starting with remote xcode+sim sandboxes for iOS, high mem sandbox with Android Emulator on GPU accel for Android.
No machine allocation but composable sandboxes that make up a developer persona’s laptop.
If interested, a quick demo here https://www.loom.com/share/c0c618ed756d46d39f0e20c7feec996d
muvaf[at]limrun[dot]com
The real insight buried in here is "build what programmers love and everyone will follow." If every user has an agent that can write code against your product, your API docs become your actual product. That's a massive shift.
Related. Others?
How I program with agents - https://news.ycombinator.com/item?id=44221655 - June 2025 (295 comments)
> It sounds like someone saying power tools should be outlawed in carpentry.
I see this a lot here
> Using anything other than the frontier models is actively harmful
If that is true, why should one invest in learning now rather than waiting for 8 months to learn whatever is the frontier model then?
> Agent harnesses have not improved much since then. There are things Sketch could do well six months ago that the most popular agents cannot do today.
I think this is a neglected area that will see a lot of development in the near future. I think that even if development on AI models stopped today - if no new model was ever trained again - there are still decades of innovation ahead of us in harnessing the models we already have.
Consider ChatGPT: the first release relied entirely on its training data to answer questions. Today, it typically does a few Google searches and summarizes the results. The model has improved, but so has the way we use it.
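As a rough sketch of that kind of harness (all names here are stand-ins I made up; `llm` and `web_search` represent whatever model API and search tool you actually wire in), the control flow is just: ask for queries, run them with an ordinary function, then ask for a summary.

    # Minimal search-then-summarize harness (illustrative only).
    def answer_with_search(question: str, llm, web_search) -> str:
        # 1. The model proposes a handful of searches.
        queries = llm(f"List up to 3 web searches that would help answer:\n{question}")
        # 2. Ordinary code runs them; the model never touches the network itself.
        snippets = []
        for q in queries.splitlines()[:3]:
            snippets.extend(web_search(q))
        # 3. The model summarizes the results into the final answer.
        context = "\n".join(snippets)
        return llm(f"Using only these search results:\n{context}\n\nAnswer: {question}")

The model didn't have to get smarter for this to work; the code around it did.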
Curious what you mean by "agent harness" here... are you distinguishing between true autonomous agents (model decides next step) vs workflows that use LLMs at specific nodes? I've found the latter dramatically more reliable for anything beyond prototyping, which makes me wonder if the "model improvement" is partly better prompting and scaffolding.
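For what it's worth, the distinction I usually draw looks roughly like this (toy sketch with invented names; `llm` is any text-in/text-out model call and `tools` is a dict of plain functions):

    # Workflow: the programmer fixes the control flow; the model fills in steps.
    def triage_ticket(ticket: str, llm) -> str:
        category = llm(f"Classify this ticket as 'bug' or 'feature':\n{ticket}")
        return llm(f"Draft a short reply for a {category} report:\n{ticket}")

    # Agent: the model picks the next action each turn; the loop just obeys.
    def agent_loop(goal: str, llm, tools: dict, max_steps: int = 10) -> str:
        history = goal
        for _ in range(max_steps):
            action = llm(f"{history}\nNext action (tool name + args, or FINISH <answer>):")
            if action.startswith("FINISH"):
                return action[len("FINISH"):].strip()
            name, _, args = action.partition(" ")
            result = tools.get(name, lambda a: f"unknown tool: {name}")(args)
            history += f"\n{action}\n-> {result}"
        return history

In my experience the fixed workflow is the easier one to make reliable, precisely because every step is pinned down.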
Look, I'm very negative about this AI thing. I think there is a great chance it will lead to something terrible and we will all die, or worse. But on the other hand, we are all going to die anyway. Some of us, the lucky ones, will die of a heart attack and will learn of our imminent demise in the second it happens, or not at all. The rest of us will have it worse. It has always been like that, and it has only gotten more devastating since we started wearing clothes and stopped being eaten alive by a savanna crocodile or freezing to death during the first snowfall of winter.
But if AI keeps getting better at code, it will produce entire in-silico simulation workflows to test new drugs or even to design synthetic life (which, again, could make us all die, or worse). Yet there is a tiny, tiny chance we will use it to fix some of the darkest aspects of human existence. I will take that.
> I am having more fun programming than I ever have, because so many more of the programs I wish I could find the time to write actually exist. I wish I could share this joy with the people who are fearful about the changes agents are bringing.
It might be just me but this reads as very tone deaf. From my perspective, CEOs are foaming at the mouth to make as many developers redundant as possible, and are not being shy about this desire. (I don't see this at all as inevitable, but tech leaders have made their position clear.)
Like, imagine the smugness of some 18th century "CEO" telling an artisan, despite the fact that he'll be resigned to working in horrific conditions at a factory, not to worry and to think of all the mass-produced consumer goods he may enjoy one day.
It's not at all a stretch of the imagination that current tech workers may be in a very precarious situation. All the slopware in the world wouldn't console them.
> In 2000, less than one percent lived on farms and 1% of workers are in agriculture. That was a net benefit to the world, that we all don't have to work to eat.
The jury's still out on that one, because climate change is an existential risk.
The author has a GitHub.
In the past couple days I've become less skeptical of the capabilities of LLMs and now more alarmed by them, contra the author. I think if we as a society continue to accept the development of LLMs and the control of them by the major AI companies there will be massively negative repercussions. And I don't mean repercussions like "a rogue AI will destroy humanity" per se, but these things will potentially cause massive social upheaval, a large amount of negative impacts on mental health and cognition, etc. I think if you see LLMs as powerful but not dangerous you are not being honest.
I have no problem with experienced senior devs using agents to write good code faster. What I have a problem with is inexperienced "vibecoders" who don't care to learn and instead use agents to write awful buggy code that will make the product harder to build on even for the agents. It used to be that lack of a basic understanding of the system was a barrier for people, but now it's not, so we're flooded with code written by imperfect models conducted by people who don't know good from bad.