logoalt Hacker News

krashidovtoday at 4:14 AM0 repliesview on HN

> We've got a QA agent that needs to run through, say, 200 markdown files of requirements in a browser session. Its a cool system that has really helped improve our team's efficiency. For the longest time we tried everything to get a prompt like the following working: "Look in this directory at the requirements files. For each requirement file, create a todo list item to determine if the application meets the requirements outlined in that file". In other words: Letting the model manage the high level control flow.

This is cool. Can you elaborate on it? Is it flaky? Does it take a long time?