Perhaps the most discussed technical detail is the “Undercover Mode.” This feature reveals that Anthropic uses Claude Code for “stealth” contributions to public open-source repositories.
The system prompt discovered in the leak explicitly warns the model: “You are operating UNDERCOVER… Your commit messages… MUST NOT contain ANY Anthropic-internal information. Do not blow your cover.”
Laws requiring AI usage to be explicitly declared should have been put in place years ago.
Be careful not to introduce security vulnerabilities such as command injection, XSS, SQL injection, and other OWASP top 10 vulnerabilities. If you notice that you wrote insecure code, immediately fix it.
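To make that concrete, here's a minimal sketch of the command-injection class the instruction names; the function names and log path are made up for illustration, not from the leak:

```python
# Hypothetical illustration of command injection; not code from the leak.
import subprocess

def grep_logs_unsafe(pattern: str) -> str:
    # VULNERABLE: pattern is interpolated into a shell string, so input
    # like '"; rm -rf / #' would run arbitrary commands.
    return subprocess.run(
        f"grep {pattern} /var/log/app.log",
        shell=True, capture_output=True, text=True,
    ).stdout

def grep_logs_safe(pattern: str) -> str:
    # FIXED: argument list with no shell, so the pattern is passed
    # verbatim to grep and never interpreted by a shell.
    return subprocess.run(
        ["grep", "--", pattern, "/var/log/app.log"],
        capture_output=True, text=True,
    ).stdout
```

The fix amounts to never handing user-controlled input to a shell in the first place.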
Lmao. I’m sure that will solve the problem of it writing insecure slop code.
It doesn’t fix it, but as stupid as it looks, it should actually improve the odds of the model catching its own insecure code.
If you’ve seen how the reasoning works, they basically spit out some garbage, then read it again and judge whether it’s garbage or not.
They do try to ‘correct their errors’, so to speak.
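Very roughly, that loop could look like the sketch below; `complete()` is a hypothetical stand-in for a model call, and the prompts and loop structure are assumptions, not anything from the leak:

```python
# Rough sketch of a draft-then-self-review loop, as described above.
def complete(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: call your model of choice here

def answer_with_self_review(question: str, max_rounds: int = 3) -> str:
    draft = complete(question)
    for _ in range(max_rounds):
        verdict = complete(
            f"Question: {question}\nDraft answer: {draft}\n"
            "Is this draft wrong or incomplete? Reply OK, or list the problems."
        )
        if verdict.strip() == "OK":
            break
        # Feed the critique back in and try again.
        draft = complete(
            f"Question: {question}\nPrevious draft: {draft}\n"
            f"Problems found: {verdict}\nWrite a corrected answer."
        )
    return draft
```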
Best part of the leak: they use regex matches for sentiment lol
I think I saw one of the keywords was dumbass. And another looked for you calling it a piece of shit.
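If that's accurate, the check could be as simple as this sketch; the two keywords come from this thread, everything else is a guess:

```python
# Guess at a regex-based frustration check; only the two keywords
# are from the thread, the rest is assumed.
import re

FRUSTRATION_PATTERNS = [
    re.compile(r"\bdumbass\b", re.IGNORECASE),
    re.compile(r"piece\s+of\s+shit", re.IGNORECASE),
]

def user_is_frustrated(message: str) -> bool:
    return any(p.search(message) for p in FRUSTRATION_PATTERNS)

# e.g. user_is_frustrated("this tool is a piece of shit") -> True
```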
Lmao, so the LLM framework falls back to the same shit ALICE used?
Something in a song on my car radio triggered my phone to wake Google yesterday, and I casually told it to fuck off, and it replied “I’m sorry you’re upset. You can send feedback.”
Adversarial audio, but just occurring by chance? Wild stuff. I was just looking into how to do that.
This is just the UI, right? Or the models too?