Hacker News

dgb23 · yesterday at 4:19 PM · 5 replies

Don't look at "thinking" tokens. LLMs sometimes produce thinking tokens that are only vaguely related to the task, if at all, and then do the correct thing anyway.


Replies

gck1 · yesterday at 5:35 PM

Why does this comment appear every time someone complains about CoT becoming more and more inaccessible with Claude?

I have entire processes built on top of summaries of CoT. They provide tremendous value, and no, I don't care if "the model still did the correct thing". Thinking blocks show me whether the model is confused, and they show me what alternative paths existed.

Besides, "correct thing" has a lot of meanings: a decision by the model may be correct relative to the context it's in but completely wrong relative to what I intended.
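As a rough sketch of the kind of process described above, one might pull the "thinking" blocks out of an Anthropic Messages API response and inspect them separately from the final answer. The content-block shape used here (blocks with `type` of `"thinking"` or `"text"`) follows the documented extended-thinking format, but treat the exact field names and the example payload as assumptions, not a definitive integration.

```python
def split_thinking(content_blocks):
    """Separate thinking blocks from answer text in a response's content list."""
    # Hypothetical shape: extended-thinking responses interleave
    # {"type": "thinking", "thinking": ...} and {"type": "text", "text": ...}.
    thinking = [b["thinking"] for b in content_blocks if b.get("type") == "thinking"]
    answer = "".join(b["text"] for b in content_blocks if b.get("type") == "text")
    return thinking, answer

# Mocked response payload (illustrative, not real API output):
blocks = [
    {"type": "thinking", "thinking": "User may mean UTC, not local time..."},
    {"type": "text", "text": "Here is the schedule in local time."},
]
thinking, answer = split_thinking(blocks)
```

In this mocked example, the thinking block surfaces an ambiguity (UTC vs. local time) that the final answer silently resolved on its own, which is exactly the kind of signal a summarized or hidden CoT can lose.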

The proof that thinking tokens are indeed useful is that Anthropic tries to hide them. If they were useless, why would they go to all this trouble?

Starting to feel PsyOp'd here.

shawnz · yesterday at 5:30 PM

Thinking summaries might not be useful for revealing the model's actual intentions, but I find that they can be helpful in signalling to me when I have left certain things underspecified in the prompt, so that I can stop and clarify.

thepasch · yesterday at 4:20 PM

They also sometimes flag stuff in their reasoning and then think themselves out of mentioning it in the response, when it would actually have been a very welcome flag.

dataviz1000 · yesterday at 5:45 PM

Thinking helps the models arrive at the correct answer more consistently. However, they get the reward only at the end of a cycle. It turns out that without heavy constraints during training, the thinking (the series of thinking tokens) is gibberish to humans.

I wonder if they decided that the gibberish performs better, and that the thinking is interesting for humans to watch but overall not very useful.

sharms · yesterday at 10:59 PM

This is because the "thinking" you see is a summary produced by a highly quantized model, not the actual model's output; it exists precisely to mask those tokens.