New 'benevolent hacking' method prevents AI from giving rogue responses

Researchers have unveiled a technique to keep AI safeguards intact, even when models are trimmed down for smaller, low-power devices.