Whatever it says is not always what it is doing https://transformer-circuits.pub/2025/attribution-graphs/bio...
> The computation we can see looks like it’s just guessing the answer, despite the chain of thought suggesting it’s computed it using a calculator.
It might be hallucinating or lying, it's not like you are actually observing the internals of the model.