Yeah this is exactly my view. We've had several years of work on the tech, and LLMs are just as prone to randomly spitting out garbage as they were the first day. They are not a tool which is fit for any serious work, because you need to be able to rely on your tools. A tool which is sometimes good and sometimes bad is worse than having no tool at all.
Do you really think that Opus 4.6 hallucinates to exactly the same degree as GPT-3.5? I am mystified how you can hold this perspective.
Did google not rely on Gemini to do their ISA changeover?
https://arxiv.org/abs/2510.14928
Was Gemini worse than no tool at all there?