irthomasthomas • today at 3:31 PM • 8 replies

I won't use or recommend models with hidden reasoning (that's all American models). It's too much of a risk, and it makes prompt optimization harder. It's risky because an attacker can prompt-inject the reasoning chain to carry out a secret objective and hide that from the summary and output.

Interleaved reasoning and function calling makes this even more dangerous. A model can call functions during the hidden reasoning phase. An attacker could then exfiltrate data from you while the reasoning summary hides it from the user.

It also makes it impossible to know if the model is doomlooping during reasoning and burning tokens for no reason, as Gemini is wont to do, which we know about because its hidden reasoning often leaks out when it doomloops.
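
To make the doomloop point concrete, here is a minimal sketch of the kind of check that only works when the reasoning text is visible. The function name `looks_like_loop` and its `ngram` and `threshold` parameters are illustrative assumptions, not part of any vendor API, and the heuristic (counting repeated word n-grams) is just one simple way to flag repetition.

```python
from collections import Counter


def looks_like_loop(text: str, ngram: int = 8, threshold: int = 5) -> bool:
    """Return True if any run of `ngram` consecutive words occurs
    at least `threshold` times in `text` (a crude repetition heuristic)."""
    words = text.split()
    if len(words) < ngram:
        # Too short to contain even one n-gram.
        return False
    # Count every sliding window of `ngram` words.
    counts = Counter(
        tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1)
    )
    return max(counts.values()) >= threshold
```

A client could run this over the reasoning stream as it arrives and cancel the request once it returns True. With hidden or summarized reasoning there is no text to run it on.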

When the models are AGI and secure from prompt injection I may stop caring. Until then, I want to know exactly how the model responds to my prompts, or exactly what the agent is doing on my behalf.

Edit, further reading: Fooling around with encrypted reasoning blobs https://blog.cryptographyengineering.com/2026/05/29/fooling-...

Replies

paweladamczuk • today at 5:41 PM

I don't think there can be tool calls inside the obfuscated reasoning blocks. I mean, in order for those function calls to be evaluated client-side, that thinking stream would have to be decrypted on the client side at some point, which would defeat the purpose of obfuscating it the way they do.

If you mean the function calls might happen server side, there is nothing preventing the server from doing it and hiding it from you as long as you are using an API for inference.

Roritharr • today at 3:53 PM

I've thought about the hijacking of reasoning chains as a potential vector, but never saw a proven implementation in American models since, from my understanding, all major vendors throw out the reasoning tokens between turns.

varenc • today at 6:10 PM

As long as thinking blocks can't make tool calls, I don't really see the exfiltration risk.

Bolwin • today at 5:46 PM

> Interleaved reasoning and function calling makes this even more dangerous. A model can call functions during the hidden reasoning phase.

The reasoning may be hidden, but the tool calls are not. How else would the client execute them?
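
A minimal sketch of that point, assuming tools run client-side: every call the model requests has to pass through client code, so it can be logged before it runs. The `TOOLS` registry, the `read_file` stub, and `dispatch_tool_call` are hypothetical names for illustration, not any vendor's SDK.

```python
import json
from typing import Any, Callable

# Hypothetical tool registry; the single stub tool is illustrative only.
TOOLS: dict[str, Callable[..., Any]] = {
    "read_file": lambda path: f"<contents of {path}>",
}

# Every tool call the model requests is recorded here.
AUDIT_LOG: list[dict[str, Any]] = []


def dispatch_tool_call(name: str, arguments_json: str) -> Any:
    """Record a tool call requested by the model, then execute it locally."""
    args = json.loads(arguments_json)
    # Log before executing so the record exists even if the tool fails.
    AUDIT_LOG.append({"tool": name, "args": args})
    if name not in TOOLS:
        raise KeyError(f"model requested unknown tool: {name}")
    return TOOLS[name](**args)
```

This only covers tools executed by the client. Calls made server-side never reach a dispatcher like this one.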

pixlmint • today at 5:51 PM

Do they do the same when using the model through the API in something like Opencode?

kapperchino • today at 4:03 PM

This agent I made can’t execute on the shell; it can only edit files within the project. It only works with Rust atm though. https://github.com/Kapperchino/agent-joe

zahlman • today at 5:31 PM

> an attacker

... what exactly is your threat model? How are "attackers" getting themselves involved in the first place?
