That is the whole challenge, actually! A new metric I'm going to dogfood into forge is ETTWS - estimated time to working solution.
A simple retry loop around your whole workflow could, in some cases, be all you need. But it could mean many blind attempts to get through a workflow successfully. And hopefully there isn't a payment step partway through!
The fewer hard errors there are that nix the whole workflow, the lower your ETTWS.
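To put rough numbers on that, here is a back-of-envelope sketch in Python. It is only one way ETTWS could be modeled, not a definition from forge. The assumptions are mine: each step succeeds independently with a fixed probability, a failed blind attempt still costs the full run time, and the function names, step times, and probabilities are all made up for illustration.

```python
from math import prod


def ettws_whole_retry(step_times, step_success):
    """Expected time when any hard error restarts the whole workflow.

    Assumption: a failed attempt still costs the full run time, and
    attempts are independent, so expected attempts = 1 / P(all steps pass).
    """
    attempt_time = sum(step_times)
    p_all = prod(step_success)
    return attempt_time / p_all


def ettws_step_retry(step_times, step_success):
    """Expected time when each step is retried on its own until it passes.

    Assumption: a failed step costs its own time only, so the expected
    time for a step is its time divided by its success probability.
    """
    return sum(t / p for t, p in zip(step_times, step_success))


# Hypothetical workflow: 5 steps, 10 seconds each, 90% success per step.
times = [10.0] * 5
probs = [0.9] * 5

whole = ettws_whole_retry(times, probs)
per_step = ettws_step_retry(times, probs)
print(f"whole-workflow retry: {whole:.1f}s, per-step retry: {per_step:.1f}s")
```

With those made-up numbers the blind whole-workflow retry comes out around 85 seconds against roughly 56 seconds for per-step retries, and the gap widens quickly as you add steps or lower the per-step success rate.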
Have you read the MAKER/MDAP paper? 1 million sequential tasks.
Is it strange that I immediately interpreted ETTWS to be Estimated Time To William Shakespeare?