AI models can pick up hidden behaviors from seemingly harmless data—even when there are no obvious clues. Researchers warn that this might be a fundamental property of neural networks.
Anthropic says AI can learn risky behaviors even when the training data looks completely safe.
