There are now several comments that (incorrectly?) interpret the undercover mode as only hiding internal information. Excerpts from the actual prompt[0]:
NEVER include in commit messages or PR descriptions:
- The phrase "Claude Code" or any mention that you are an AI
- Co-Authored-By lines or any other attribution
BAD (never write these):
- 1-shotted by claude-opus-4-6
- Generated with Claude Code
- Co-Authored-By: Claude Opus 4.6 <…>
This very much sounds like it does what it says on the tin, i.e. stays undercover and pretends to be a human. It's especially worrying that the prompt is explicitly written for contributions to public repositories.[0]: https://github.com/chatgptprojects/claude-code/blob/642c7f94...
I would have expected people (maybe a small minority, but that includes myself) to have already instructed Claude to do this. It’s a trivial instruction to add to your CLAUDE.md file.
It's less about pretending to be a human and more about not inviting scrutiny and ridicule toward Claude if the code quality is bad. They want the real human to appear to be responsible for accepting Claud's poor output.
Ive seen it say coauthored by claude code on my prs...and I agree I dont want it to do that
None of this is really worrying, this is a pattern implemented in a similar way by every single developer using AI to write commit messages after noticing how exceptionally noisy they are to self-attribute things. Anthropics views on AI safety and alignment with human interests dont suddenly get thrown out with the bathwater because of leaked internal tooling of which is functionally identical to a basic prompt in a mere interface (and not a model). I dont really buy all the forced "skepticism" on this thread tbh.
You can already turn off "Co-Authored-By" via Claude Code config. This is what their docs show:
~/.claude/settings.json
{
"attribution": {
"commit": "",
"pr": ""
},
The rest of the prompt is pretty clear that it's talking about internal use.Claude Code users aren't the ones worried about leaking "internal model codenames" nor "unreleased model opus-4-8" nor Slack channel names. Though, nobody would want that crap in their generated docs/code anyways.
Seems like a nothingburger, and everyone seems to be fantasizing about "undercover mode" rather than engaging with the details.
My first reaction is that they are using this to take advantage of OSS reviewers for in the wild evals.
People make fun that we should say magic words in interaction with LLMs. How frustrated can Claude be? /s
I cringe every time I see Claude trying to co-author a commit. The git history is expected to track accountability and ownership, not your Bill of Tools. Should I also co-author my PRs with my linter, intellisense and IDE?