Hacker News

ForHackernews, today at 5:19 PM

What it says it is doing is not always what it is actually doing: https://transformer-circuits.pub/2025/attribution-graphs/bio...

> The computation we can see looks like it’s just guessing the answer, despite the chain of thought suggesting it’s computed it using a calculator.

It might be hallucinating or lying; it's not like you are actually observing the internals of the model.