AIBearisharXiv โ CS AI ยท 2d ago7/10
๐ง
Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference
Researchers have discovered a new 'multi-stream perturbation attack' that can break safety mechanisms in thinking-mode large language models by overwhelming them with multiple interleaved tasks. The attack achieves high success rates across major LLMs including Qwen3, DeepSeek, and Gemini 2.5 Flash, causing both safety bypass and system collapse.
๐ง Gemini