Hacker News

Build a Basic AI Agent from Scratch: Long Task Planning

114 points by ruxudev last Tuesday at 2:29 PM | 45 comments

Comments

athrowaway3z today at 12:50 PM

I've tried most forms of planning - from the basic AGENTS.md guide to keeping ./dev/ plan files, todo list tools, a SQLite db with both minimal and extensive tracking, etc.

None of them have been worth it. A year ago the models needed to be reminded. Today they can follow a plan from text alone. This is my experience from working on a project alone - in teams ... I actually think the same lesson holds in the new AI paradigm.

My current scheme is basically this - in order of the task's complexity:

- Tell an agent to do something

- Tell an agent to make a plan then tell it to execute on it.

- Tell an agent to make a plan, write to a file, have a subagent review it, then execute it.

- Do the above, but instead tell the agent they're in a supervise mode and to have subagents implement as many phases and roll over with a handoff.md while they, as the supervisor agent, keep driving the task to completion.

The latter two I have under a sigil so they're prepared prompts I can inject with a few keystrokes.

If I feel very fancy I'll tell them to update the plan with a checklist and add checkboxes, but it just doesn't pay enough to have an 'init-prompt' level planning feature or tools if in the same context you already have files/read/write.
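
For anyone who wants to try the sigil idea, here is a minimal sketch of how prepared prompts could be expanded before a message reaches the agent. The sigil names (`!review`, `!supervise`), the prompt wording, and the `expand_sigils` helper are all assumptions made up for illustration, not the actual prompts or tooling described above.

```python
# Hypothetical sketch: sigil names and prompt wording are illustrative
# assumptions, not the commenter's real prepared prompts.
PREPARED_PROMPTS = {
    "!review": (
        "Make a plan for the task below and write it to a file. "
        "Have a subagent review the plan, then execute it."
    ),
    "!supervise": (
        "You are in supervise mode. Make a plan for the task below, write it "
        "to a file, and have a subagent review it. Have subagents implement "
        "the phases and roll over with a handoff.md while you, as the "
        "supervisor, keep driving the task to completion."
    ),
}


def expand_sigils(message: str) -> str:
    """Replace a leading sigil with its prepared prompt; pass other text through."""
    head, _, rest = message.partition(" ")
    prompt = PREPARED_PROMPTS.get(head)
    if prompt is None:
        # Simplest tier: no sigil, just tell the agent what to do.
        return message
    return f"{prompt}\n\nTask: {rest.strip()}"
```

Typing `!supervise port the build to Hugo` would then send the full supervisor prompt followed by the task, while a message without a known sigil goes through unchanged.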

charles_f today at 2:40 PM

> In my case, I asked it to migrate my static site from using Eleventy to Hugo

This blog is on Medium so I guess the migration went sideways!

Joke aside, nice series of tutorials, don't let the haters get to you. I think with the current token panic it might come in handy soon.

jdw64 today at 12:23 PM

I don't understand why people criticize this post. When you run a homepage or a blog, it's unavoidable to write script-style code. Even if the quality is a bit low, that's the limit within a tutorial. If you go into actual design, things like boundaries, policies, error handling, and so on require a lot of prior knowledge. So when that knowledge would be needed, you can only post something as a simple runnable script.

For example, if I were building real software, I would design everything from general policies to error logging policies and so on. But when writing a blog post, it's just simplified into a short runnable script.

Havoc today at 12:12 PM

What's with all the aggression here? Not very HN.

b800h today at 11:44 AM

Why do people use Medium?

andai today at 3:33 PM

What's the point of the scratch pad? Isn't the same data already in the context? Or does it help because contexts are lossy and bias towards the start and end?

Similar question with the to-do list. Do they actually help task completion? Is there any research on that? I think they're less helpful with more recent models, but maybe they still help with smaller ones?

The system prompt asking it to make a plan before starting work does sound helpful though. (Of course it would also be great to see numbers there :)

mxkopy today at 11:41 AM

Jesus the terminology is so fucked… compare the contents of this blog post with any RL paper containing the words “long term planning”…

niggischiggi today at 10:55 AM

Yeah yeah... the world needs even more "aI aGenTz". This will help fight climate change and child starvation.

aafaqzahid today at 11:56 AM

Are people using Medium in 2026?

elxr today at 11:50 AM

Code tutorial on Medium (whose formatting is absolutely not meant for this)?

Please stop posting.
