My agents often write themselves scripts. Isn't that effectively what you're asking for? Prompting for scripts can also be a useful time and accuracy tactic when you know it'll be a good fit for it.
Yeah, the problem is that I do not think the agents is good at reusing scripts and stitching it together.At least for me it's recreating to much similar. I hope we will see platforms like windmill.dev find the optimal solution for this. I have not been able to test it enough. But have a platform that gives you some observability out of the box and protect secrets from llm is nice
The problem is that code it spits out on the fly is untested and untrustworthy. Identify the parts of your workflow that could be accomplished with regular code - write and unit test that code, with LLM help if you want, and use the llm as the orchestrator only.