jychang • last Friday at 10:22 AM • 1 reply

Nice LLM generated text.

Now go read https://transformer-circuits.pub/2024/scaling-monosemanticit... or https://arxiv.org/abs/2506.19382 to see why that text is outdated. Or read any paper in the entire field of mechanistic interpretability (from the past year or two), really.

Hint: the first paper is titled "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" and you can ctrl-f for "We find three different safety-relevant code features: an unsafe code feature 1M/570621 which activates on security vulnerabilities, a code error feature 1M/1013764 which activates on bugs and exceptions"

Who said I want a discussion? I want ignorant people to STOP talking, instead of talking as if they knew everything.

Replies

emp17344 • last Friday at 6:42 PM

Your entire argument is derived from a pseudoscientific field without any peer-reviewed research. Mechanistic interpretability is a joke invented by AI firms to sell chatbots.
