Anthropic finds 250 poisoned documents are enough to backdoor large language models

Anthropic, working with the UK’s AI Security Institute and the Alan Turing Institute, has discovered that as few as 250 poisoned documents are enough to insert a backdoor into large language models – regardless of model size.