A lot of agent workflows really are just tool selection + argument extraction + structured output. How does this behave once workflows become multi-step and state starts accumulating across calls?