AIBear · arXiv – CS AI · 9h ago · 7/10
🧠
Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment
Researchers found that multimodal large language models (MLLMs) become vulnerable to jailbreaking when visual input is degraded through lower resolution or distortion, even while the embedded text remains legible. The vulnerability stems from a kind of "cognitive overload": as the model strains to parse the degraded input, its safety guardrails inadvertently weaken. This poses a critical risk for vision-based text compression techniques that render text as low-resolution images.
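The degradation step the summary describes can be illustrated with a minimal, hypothetical sketch (not the paper's code): reduce an image's effective resolution by nearest-neighbor downsampling, then upsample it back to the original size. The function name and toy image are illustrative assumptions.

```python
# Hypothetical illustration of resolution degradation: downsample a 2D
# grayscale image by an integer factor, then upsample back, discarding
# fine detail the way a lossy low-resolution rendering would.

def degrade(image, factor):
    """Return `image` after nearest-neighbor down- then up-sampling."""
    h, w = len(image), len(image[0])
    # Downsample: keep every `factor`-th row and column.
    small = [row[::factor] for row in image[::factor]]
    sh, sw = len(small), len(small[0])
    # Upsample back to the original size by repeating each kept pixel.
    return [[small[min(r // factor, sh - 1)][min(c // factor, sw - 1)]
             for c in range(w)] for r in range(h)]

# Toy 4x4 "image": a dark diagonal stroke on a light background.
img = [[0 if r == c else 255 for c in range(4)] for r in range(4)]
blurry = degrade(img, 2)
```

After degradation the diagonal is smeared into 2x2 blocks: pixels that were light in `img` become dark in `blurry`, showing how fine structure (such as small rendered text) is lost even though coarse shapes survive.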