Don't look at "thinking" tokens. LLMs sometimes produce thinking tokens that are only vaguely related to the task, if related at all, and then do the correct thing anyway.
Thinking summaries might not be useful for revealing the model's actual intentions, but I find that they can be helpful in signalling to me when I have left certain things underspecified in the prompt, so that I can stop and clarify.
They also sometimes flag an issue in their reasoning and then talk themselves out of mentioning it in the response, when it would actually have been a very welcome flag.
Thinking helps the models arrive at the correct answer more consistently. However, they only get the reward at the end of a cycle. It turns out that without heavy constraints during training, the thinking itself, the series of thinking tokens, drifts into gibberish to humans.
I wonder if they decided that the gibberish actually performs better, and that the thinking is interesting for humans to watch but overall not very useful.
This is because the "thinking" you see is a summary produced by a highly quantized model, not the actual model, in order to mask these tokens.
Why does this comment appear every time someone complains about CoT becoming more and more inaccessible with Claude?
I have entire processes built on top of summaries of CoT. They provide tremendous value, and no, I don't care that "the model still did the correct thing". Thinking blocks show me whether the model is confused, and they show me what alternative paths existed.
Besides, "correct thing" has a lot of meanings, and a decision by the model may be correct relative to the context it's in but completely wrong relative to what I intended.
The proof that thinking tokens are indeed useful is that Anthropic tries to hide them. If they were useless, why would they even bother with all of this?
Starting to feel PsyOp'd here.