
yorwba · today at 10:28 AM

The verbalizer and reconstruction models are both initially finetuned on LLM output from a summarization prompt. The resulting text is not completely unrelated to the input, but it's mostly wrong: https://transformer-circuits.pub/2026/nla/png/img_18fcfc16e9... The reconstructed activations are also far from matching the verbalizer's input. It's not unusual in machine learning for results to be shit and SOTA at the same time, simply because no other technique works better.
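
For concreteness, here's a minimal sketch of the round trip being evaluated, with toy linear layers standing in for the actual LLM-based verbalizer and reconstruction models (the names, dimensions, and linear stand-ins are my assumptions, not from the paper):

    # Toy sketch of the verbalize-then-reconstruct round trip.
    # The real models are LLMs; these linear layers are stand-ins.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    D_ACT = 512  # activation dimensionality (assumed)
    D_TXT = 64   # stand-in for the verbalizer's text output (assumed)

    verbalizer = nn.Linear(D_ACT, D_TXT)      # activations -> "text"
    reconstructor = nn.Linear(D_TXT, D_ACT)   # "text" -> activations

    acts = torch.randn(32, D_ACT)             # original activations
    recon = reconstructor(verbalizer(acts))   # round-tripped activations

    # The point above: this gap stays large even after finetuning.
    cos = F.cosine_similarity(recon, acts, dim=-1).mean()
    mse = F.mse_loss(recon, acts)
    print(f"mean cosine similarity: {cos:.3f}, MSE: {mse:.3f}")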