What is the ‘instruction to mislead’ referring to?
Likely the directive in the source code, applied to users internal to Anthropic, that the LLM should not refer to being an LLM or mention model names in commit messages or comments. That way, when they contribute code to external repos, it’s not immediately identifiable as LLM-generated.
The poison pills that are there to mislead you if you try to reverse-engineer it.