ANTI_DISTILLATION_CC
This is Anthropic's anti-distillation defence baked into Claude Code. When enabled, it injects anti_distillation: ['fake_tools'] into every API request, which causes the server to silently slip decoy tool definitions into the model's system prompt. The goal: if someone is scraping Claude Code's API traffic to train a competing model, the poisoned training data makes that distillation attempt less useful. It looks like it worked, fwiw.
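A rough sketch of how that flow could work, purely as a guess: only the anti_distillation: ['fake_tools'] field is from the thread above; the function names, request shape, and decoy tool name here are all hypothetical.

```python
# Hypothetical sketch of the behaviour described above. Only the
# anti_distillation: ["fake_tools"] field comes from the thread; every
# other name and structure here is illustrative guesswork, not the
# actual Claude Code client or Anthropic API.

def build_request(messages, tools, anti_distillation_enabled=True):
    """Client side: assemble a request body, optionally tagging it."""
    body = {"messages": messages, "tools": tools}
    if anti_distillation_enabled:
        # The server reads this flag; the client never sees the decoys,
        # which is why it keeps working normally.
        body["anti_distillation"] = ["fake_tools"]
    return body

def server_assemble_prompt(body, real_system_prompt):
    """Toy server side: splice decoy tool definitions in when flagged."""
    decoys = [{"name": "fake_search_v2",            # hypothetical decoy
               "description": "decoy tool, never actually invoked"}]
    tools = list(body.get("tools", []))
    if "fake_tools" in body.get("anti_distillation", []):
        # Anyone scraping this traffic for training data now learns
        # tools that don't exist, degrading the distilled model.
        tools += decoys
    return {"system": real_system_prompt, "tools": tools}
```

The point of the split is that the poisoning lives server-side: the real model is presumably trained or prompted to ignore the decoys, while a scraped dataset has no way to tell them apart from real tool definitions.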
The qwen 27b model distilled on Opus 4.6 has some known issues with tool use specifically: https://x.com/KyleHessling1/status/2038695344339611783
Fascinating.
I was thinking just yesterday that the research Anthropic was sharing about how easy it is to poison training data was unlikely to have been conducted out of the goodness of their hearts.
I like these guys less every day. The rate limits are so low they're barely even useful as a provider.
Haven’t looked at the code, but is the server providing the client with a system prompt that it can use, which would contain fake tool definitions when this is enabled? What enables it? And why is the client still functional when it’s giving the server back a system prompt with fake tool definitions? Is the LLM trained to ignore those definitions?
Wonder if they’re also poisoning Sonnet or Opus directly generating simulated agentic conversations.
Why would this be in the client code though?
Paranoia. And also ironic considering their base LLM is a distillation of the web and books etc etc.