Hacker News

LPisGood today at 4:36 PM

The point of this post isn’t whether the “reasoning” phase of LLM thinking matches what humans consider reasoning; it’s that Anthropic is intentionally hiding Claude’s “reasoning output” to make the model harder to distill.


Replies

0o_MrPatrick_o0 today at 4:47 PM

Reading these comments is so harrowing.

You are generally correct about my intentions with this post.

I want to highlight:

I want to measure the performance of LLMs over time, which includes assessing the quality of their outputs. I don’t see the reasoning output as anything other than a measurable signal of possible drift in model performance.

Except it isn’t, because I’m only getting a low-value summary of the thinking.

It’s like asking your buddy how fast he thought that last pitch was when radar guns are behind the plate.

Yeah, it’s a description related to what happened, but it’s not the thing I want to measure.
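
For what it's worth, the kind of drift check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything from the original post: the signal (reasoning token counts on a fixed prompt set), the function names `drift_score` and `has_drifted`, the sample numbers, and the 3-sigma threshold are all hypothetical.

```python
from statistics import mean, stdev


def drift_score(baseline, recent):
    """Return how many baseline standard deviations the recent mean has moved."""
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least 2 baseline samples and 1 recent sample")
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any change at all counts as unbounded drift.
        return 0.0 if mean(recent) == mean(baseline) else float("inf")
    return (mean(recent) - mean(baseline)) / spread


def has_drifted(baseline, recent, threshold=3.0):
    """Flag drift when the recent mean sits more than `threshold` sigmas away."""
    return abs(drift_score(baseline, recent)) > threshold


# Hypothetical per-run signal, e.g. reasoning tokens used on the same prompt set.
baseline_runs = [100, 102, 98, 101, 99]
recent_runs = [80, 82, 79]

if has_drifted(baseline_runs, recent_runs):
    print("possible drift in the measured signal")
```

A check like this is only as good as the signal fed into it, which is the complaint here: a summary of the thinking is a different quantity from the thinking itself.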
