I have gone through this process and evaluated the results. Maybe you're referring to their comment as written, but going through what OC described + handholding leads to very good results in my experience.
Being "very good" 99 percent of the time and hallucinating the other 1 percent makes the "very good" part untrustworthy.
I agree with you, agentdev! If you want accurate results, you need a harness in place to control the quality of the output.
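For what it's worth, here is a rough sketch of what such a harness could look like. Everything in it is made up for illustration: `run_with_harness`, `fake_model`, `is_grounded`, and the `KNOWN_FACTS` set are hypothetical stand-ins, not anyone's actual setup. The idea is only that output is never accepted until it passes explicit checks, and a failed check triggers a retry or a refusal.

```python
from typing import Callable, Iterable, Optional

# Hypothetical ground truth the harness can verify against (an assumption).
KNOWN_FACTS = {"Paris is in France", "Berlin is in Germany"}


def is_grounded(output: str) -> bool:
    """Accept only statements found in the trusted fact set."""
    return output in KNOWN_FACTS


def run_with_harness(
    generate: Callable[[int], str],
    checks: Iterable[Callable[[str], bool]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call generate(attempt) until the output passes every check.

    Returns the first output that passes, or None if every attempt fails,
    so the caller gets a refusal instead of an unverified answer.
    """
    checks = list(checks)
    for attempt in range(1, max_attempts + 1):
        output = generate(attempt)
        if all(check(output) for check in checks):
            return output
    return None


def fake_model(attempt: int) -> str:
    """Stand-in for a model call: wrong on the first try, right afterwards."""
    return "Paris is in Germany" if attempt == 1 else "Paris is in France"


if __name__ == "__main__":
    print(run_with_harness(fake_model, [is_grounded]))
    print(run_with_harness(lambda attempt: "made-up claim", [is_grounded]))
```

The harness does not make the model any more accurate. It just makes the 1 percent visible, because a bad answer either gets retried or comes back as `None` instead of being passed along.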