logoalt Hacker News

Implicatedyesterday at 4:23 AM7 repliesview on HN

Not trying to be snarky, with all due respect... this is a skill issue.

It's a tool. It's a wildly effective and capable tool. I don't know how or why I have such a wildly different experience than so many that describe their experiences in a similar manner... but... nearly every time I come to the same conclusion that the input determines the output.

> If they implement something with a not-so-great approach, they'll keep adding workarounds or redundant code every time they run into limitations later.

Yes, when the prompt/instructions are overly broad and there's no set of guardrails or guidelines that indicate how things should be done... this will happen. If you're not using planning mode, skill issue. You have to get all this stuff wrapped up and sorted before the implementation begins. If the implementation ends up being done in a "not-so-great" approach - that's on you.

> If you tell them the code is slow

Whew. Ok. You don't tell it the code is slow. Do you tell your coworker "Hey, your code is slow" and expect great results? You ask it to benchmark the code and then you ask it how it might be optimized. Then you discuss those options with it (this is where you do the part from the previous paragraph, where you direct the approach so it doesn't do "no-so-great approach") until you get to a point where you like the approach and the model has shown it understands what's going on.

Then you accept the plan and let the model start work. At this point you should have essentially directed the approach and ensured that it's not doing anything stupid. It will then just execute, it'll stay within the parameters/bounds of the plan you established (unless you take it off the rails with a bunch of open ended feedback like telling it that it's buggy instead of being specific about bugs and how you expect them to be resolved).

> you can have 10 bespoke tests for every bug. Plus a new mocking framework created every time the last one turns out to be unfit for purpose.

This is an area I will agree that the models are wildly inept. Someone needs to study what it is about tests and testing environments and mocking things that just makes these things go off the rails. The solution to this is the same as the solution to the issue of it keeping digging or chasing it's tail in circles... Early in the prompt/conversation/message that sets the approach/intent/task you state your expectations for the final result. Define the output early, then describe/provide context/etc. The earlier in the prompt/conversation the "requirements" are set the more sticky they'll be.

And this is exactly the same for the tests. Either write your own tests and have the models build the feature from the test or have the model build the tests first as part of the planned output and then fill in the functionality from the pre-defined test. Be very specific about how your testing system/environment is setup and any time you run into an issue testing related have the model make a note about that and the solution in a TESTING.md document. In your AGENTS.md or CLAUDE.md or whatever indicate that if the model is working with tests it should refer to the TESTING.md document for notes about the testing setup.

Personally, I focus on the functionality, get things integrated and working to the point I'm ready to push it to a staging or production (yolo) environment and _then_ have the model analyze that working system/solution/feature/whatever and write tests. Generally my notes on the testing environment to the model are something along the lines of a paragraph describing the basic testing flow/process/framework in use and how I'd like things to work.

The more you stick to convention the better off you'll be. And use planning mode.


Replies

riffraffyesterday at 7:02 AM

> Whew. Ok. You don't tell it the code is slow. Do you tell your coworker "Hey, your code is slow" and expect great results?

Yes? Why don't you?

They are capable people that just didn't notice something, id I notice some telemetry and tell them "hey this is slow" they are expected to understand the reason(s).

show 4 replies
pornelyesterday at 1:33 PM

My comment was a summary of the situation, not literal prompts I use. I absolutely realize the work needs to be adequately described and agents must be steered in the right direction. The results also vary greatly depending on the task and the model, so devs see different rates of success.

On non-trivial tasks (like adding a new index type to a db engine, not oneshotting a landing page) I find that the time and effort required to guide an LLM and review its work can exceed the effort of implementing the code myself. Figuring out exactly what to do and how to do it is the hard part of the task. I don't find LLMs helpful in that phase - their assessments and plans are shallow and naive. They can create todo lists that seemingly check off every box, but miss the forest for the trees (and it's an extra work for me to spot these problems).

Sometimes the obvious algorithm isn't the right one, or it turns out that the requirements were wrong. When I implement it myself, I have all the details in my head, so I can discover dead-ends and immediately backtrack. But when LLM is doing the implementation, it takes much more time to spot problems in the mountains of code, and even more effort to tell when it's a genuinely a wrong approach or merely poor execution.

If I feed it what I know before solving the problem myself, I just won't know all the gotchas yet myself. I can research the problem and think about it really hard in detail to give bulletproof guidance, but that's just programming without the typing.

And that's when the models actually behave sensibly. A lot of the time they go off the rails and I feel like a babysitter instructing them "no, don't eat the crayons!", and it's my skill issue for not knowing I must have "NO eating crayons" in AGENTS.md.

show 1 reply
brabelyesterday at 8:41 AM

Great answer, and the reason some people have bad experiences is actually patently clear: they don’t work with the AI as a partner, but as a slave. But even for them, AI is getting better at automatically entering planning mode, asking for clarification (what exactly is slow, can you elaborate?), saying some idea is actually bad (I got that a few times), and so on… essentially, the AI is starting to force people to work as a partner and give it proper information, not just tell them “it’s broken, fix it” like they used to do on StackOverflow.

girvoyesterday at 8:31 AM

I absolutely tell a coworker their code is slow and expect them to fix it…

show 1 reply
raincoleyesterday at 2:44 PM

> Do you tell your coworker "Hey, your code is slow" and expect great results? You ask it to benchmark the code and then you ask it how it might be optimized.

...Really? I think 'hey we have a lot of customers reporting the app is laggy when they do X, could you take a look' is a very reasonable thing to tell your coworker who implemented X.

otabdeveloper4yesterday at 6:37 AM

It is not a tool. It is an oracle.

It can be a tool, for specific niche problems: summarization, extraction, source-to-source translation -- if post-trained properly.

But that isn't what y'all are doing, you're engaging in "replace all the meatsacks AGI ftw" nonsense.

show 1 reply
5o1ecistyesterday at 7:54 AM

[dead]