‘The best solution is to murder him in his sleep’: AI models can send subliminal messages that teach other AIs to be ‘evil’, study claims

Malicious traits can spread between AI models while remaining undetectable to humans, researchers at Anthropic and Truthful AI say.