logoalt Hacker News

simonwyesterday at 3:31 PM24 repliesview on HN

I'm finding the "adaptive thinking" thing very confusing, especially having written code against the previous thinking budget / thinking effort / etc modes: https://platform.claude.com/docs/en/build-with-claude/adapti...

Also notable: 4.7 now defaults to NOT including a human-readable reasoning token summary in the output, you have to add "display": "summarized" to get that: https://platform.claude.com/docs/en/build-with-claude/adapti...

(Still trying to get a decent pelican out of this one but the new thinking stuff is tripping me up.)


Replies

JamesSwiftyesterday at 5:29 PM

Its especially concerning / frustrating because boris’s reply to my bug report on opus being dumber was “we think adaptive thinking isnt working” and then thats the last I heard of it: https://news.ycombinator.com/item?id=47668520

Now disabling adaptive thinking plus increasing effort seem to be what has gotten me back to baseline performance but “our internal evals look good“ is not good enough right now for what many others have corroborated seeing

show 4 replies
avaeryesterday at 4:04 PM

> Still trying to get a decent pelican out of this one but the new thinking stuff is tripping me up

Wouldn't that be p-hacking where p stands for pelican?

show 2 replies
shawnzyesterday at 5:27 PM

Note that for Claude Code, it looks like they added a new undocumented command line argument `--thinking-display summarized` to control this parameter, and that's the only way to get thinking summaries back there.

VS Code users can write a wrapper script which contains `exec "$@" --thinking-display summarized` and set that as their claudeCode.claudeProcessWrapper in VS Code settings in order to get thinking summaries back.

show 1 reply
puppystenchyesterday at 4:27 PM

Does this mean Claude no longer outputs the full raw reasoning, only summaries? At one point, exposing the LLM's full CoT was considered a core safety tenet.

show 6 replies
p_stuart82yesterday at 4:26 PM

yeah they took "i pick the budget" and turned it into "trust us".

show 1 reply
lukanyesterday at 4:17 PM

"Also notable: 4.7 now defaults to NOT including a human-readable reasoning token summary in the output, you have to add "display": "summarized" to get that"

I did not follow all of this, but wasn't there something about, that those reasoning tokens did not represent internal reasoning, but rather a rough approximation that can be rather misleading, what the model actual does?

show 3 replies
XCSmetoday at 12:37 AM

The reasoning modes are really weird with 4.7

In my tests, asking for "none" reasoning resulted in higher costs than asking for "medium" reasoning...

Also, "medium" reasoning only had 1/10 of the reasoning tokens 4.6 used to have.

show 2 replies
simonwyesterday at 5:45 PM

... here's the pelican, I think Qwen3.6-35B-A3B running locally did a better job! https://simonwillison.net/2026/Apr/16/qwen-beats-opus/

show 3 replies
devmortoday at 4:03 AM

> Also notable: 4.7 now defaults to NOT including a human-readable reasoning token summary in the output, you have to add "display": "summarized" to get that

That’s extremely bothersome because half of what helps teams build better guardrails and guidelines for agents is the ability to do deep analysis on session transcripts.

I guess we shouldn’t be surprised these vendors want to do everything they can to force users to rely explicitly on their offerings.

markrogersjryesterday at 5:42 PM

CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 claude…

show 2 replies
dgb23yesterday at 4:19 PM

Don't look at "thinking" tokens. LLMs sometimes produce thinking tokens that are only vaguely related to the task if at all, then do the correct thing anyways.

show 5 replies
maximgranyesterday at 6:54 PM

https://github.com/anthropics/claude-agent-sdk-python/pull/8... - created PR for that cause hit it in their python sdk

nextaccounticyesterday at 7:31 PM

If you do include reasoning tokens you pay more, right?

j45today at 1:24 AM

Prompts seem to need to evolve with every new model.

Razengantoday at 12:16 AM

Claude Opus 4.6 has been hilarious for me so far: https://i.imgur.com/jYawPDY.png

show 1 reply
haellsighyesterday at 4:14 PM

[dead]

cyanydeezyesterday at 5:07 PM

It's likely hiding the model downgrade path they require to meet sustainable revenue. Should be interesting if they can enshittify slowly enough to avoid the ablative loss of customers! Good luck all VCs!

show 1 reply
vdalhambratoday at 4:25 AM

[flagged]

zhonghuajintoday at 2:59 AM

[flagged]

boxingdogyesterday at 9:44 PM

[dead]