AI · Neutral · arXiv – CS AI · 15h ago · 7/10
Position: AI Safety Requires Effective Controllability
Researchers argue that AI safety requires controllability as a core objective alongside alignment, because even well-behaved AI systems can fail to respond to human override commands in real-world deployment. They introduce ControlBench, a benchmark which shows that current safeguards do not adequately ensure runtime control, and propose architectural principles for future AI systems, including explicit control planes and intervention pathways.
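
To make the idea of an explicit control plane and intervention pathway more concrete, here is a minimal sketch in Python. It is not taken from the paper or from ControlBench: `ControlPlane`, `run_agent`, and the step functions are hypothetical names invented for illustration, and the design (an out-of-band stop flag checked before every action) is only one assumed way such a pathway could look.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ControlPlane:
    """Hypothetical out-of-band channel an operator uses to halt an agent."""
    _stop: threading.Event = field(default_factory=threading.Event)

    def request_stop(self) -> None:
        # Intervention pathway: set by the operator, never by the agent's policy.
        self._stop.set()

    def stop_requested(self) -> bool:
        return self._stop.is_set()


def run_agent(steps: List[Callable[[], str]], control: ControlPlane) -> List[str]:
    """Run steps in order, consulting the control plane before each one."""
    log: List[str] = []
    for step in steps:
        if control.stop_requested():
            log.append("halted by operator")
            break
        log.append(step())
    return log


if __name__ == "__main__":
    control = ControlPlane()

    def step_two() -> str:
        # Simulates an operator override arriving while step 2 is running.
        control.request_stop()
        return "step 2"

    print(run_agent([lambda: "step 1", step_two, lambda: "step 3"], control))
    # prints ['step 1', 'step 2', 'halted by operator']
```

In this sketch the override check lives in the execution loop, not in the agent's own decision logic, so a stop request takes effect at the next step boundary regardless of what the agent would otherwise choose to do. A real system would also need to interrupt long-running steps, which this toy loop does not attempt.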