I do this kind of parallelism with a little merge request tool I slopped together. I spin up multiple small agents, assign each one a specific code review task (security, coding standards, etc.), and have it spit out a GitLab API draft JSON object, with code examples for the MR, that I can deterministically validate. If the output is missing code examples (where the task calls for them) or doesn't match the expected JSON schema, I have "ask it to try again" logic in place.
Works fine. Forcing LLMs to output parsable responses is a good workaround to get them to do what you want until they improve. It also lets you use the fast models (e.g. I spin up the Gemini 3.1 flash lite model for these tasks), so the tasks finish in seconds rather than minutes.
Similar to your method.
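
For anyone curious, the validate-and-retry loop could look roughly like the sketch below. This is not the actual tool. The schema fields (`note`, `position.new_path`, `position.new_line`), the function names `validate_draft` and `review_with_retry`, and the `call_model` callback are all made up for illustration, and the real GitLab draft payload may need different fields.

```python
import json

FENCE = "`" * 3  # markdown code fence marker, used to detect a code example


def validate_draft(raw, require_code):
    """Return a list of problems; an empty list means the draft is acceptable.

    The expected shape is a hypothetical schema, not GitLab's actual one.
    """
    try:
        draft = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(draft, dict):
        return ["top-level value must be a JSON object"]

    errors = []
    note = draft.get("note")
    if not isinstance(note, str) or not note.strip():
        errors.append("'note' must be a non-empty string")
    elif require_code and note.count(FENCE) < 2:
        # An opening and a closing fence are needed for one code example.
        errors.append("'note' must contain a fenced code example")

    position = draft.get("position")
    if not isinstance(position, dict):
        errors.append("'position' must be an object")
    else:
        if not isinstance(position.get("new_path"), str):
            errors.append("'position.new_path' must be a string")
        new_line = position.get("new_line")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(new_line, int) or isinstance(new_line, bool):
            errors.append("'position.new_line' must be an integer")
    return errors


def review_with_retry(call_model, prompt, require_code=True, max_attempts=3):
    """Call the model until its reply validates, feeding errors back each time.

    call_model is any function that takes a prompt string and returns the
    model's raw text reply (an assumption; plug in whatever client you use).
    """
    errors = []
    feedback = ""
    for _ in range(max_attempts):
        raw = call_model(prompt + feedback)
        errors = validate_draft(raw, require_code)
        if not errors:
            return json.loads(raw)
        # The "ask it to try again" step: tell the model what was wrong.
        feedback = (
            "\n\nYour previous reply was rejected:\n- "
            + "\n- ".join(errors)
            + "\nReply with only the corrected JSON object."
        )
    raise ValueError(f"no valid draft after {max_attempts} attempts: {errors}")
```

Since the check is plain code rather than another model call, each small agent can run this loop independently, and a failed attempt costs only one more fast-model request.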