Why do people keep insisting that LLMs don't follow a chain-of-reasoning process? With the latest LLMs you can see exactly what they "think" and the resulting output. Plausible code does not mean random code, as you seem to imply; it means... code that could work for this particular situation.
Because they don't. The chain-of-reasoning feature is really just a way to get the LLM to prompt itself more: the "thinking" text it generates becomes extra context that it then continues from.
The fact that it generates these "thinking" steps does not mean it is using them for reasoning. Their most useful effect is making it seem to a human that there is a reasoning process.
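To make the "prompt more" point concrete, here is a toy sketch of an autoregressive loop. It is not how any particular product is implemented; `generate`, `toy_next_token`, and `TOY_RULES` are hypothetical names, and the lookup table is a stand-in for a real model. All it shows is that "thinking" tokens are appended to the context the same way as any other token, and the next token is picked from that longer context.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model: picks the next token from the last one.
# A real LLM conditions on the whole context; this table is only a toy.
TOY_RULES: Dict[str, str] = {
    "?": "<think>",
    "<think>": "2+2=4",
    "2+2=4": "</think>",
    "</think>": "4",
    "4": "<eos>",
}


def toy_next_token(context: List[str]) -> str:
    """Return the next token given everything generated so far."""
    return TOY_RULES.get(context[-1], "<eos>")


def generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    max_tokens: int = 20,
) -> List[str]:
    """Autoregressive loop: every emitted token is fed back in as context."""
    context = list(prompt)
    for _ in range(max_tokens):
        token = next_token(context)
        # "Thinking" tokens get no special treatment: they are appended
        # to the context exactly like the final answer tokens.
        context.append(token)
        if token == "<eos>":
            break
    return context


if __name__ == "__main__":
    print(generate(toy_next_token, ["what", "is", "2+2", "?"]))
```

Whether you call what happens between `<think>` and `</think>` reasoning is the actual disagreement here; the loop itself only shows that the text is more input for the next step.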