AI | Bearish | Importance: 7/10 | Actionable
Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference
AI Summary
Researchers have discovered a new 'multi-stream perturbation attack' that breaks the safety mechanisms of thinking-mode large language models by overwhelming them with multiple interleaved tasks. The attack achieves high success rates against major LLMs including the Qwen3 series, DeepSeek, and Gemini 2.5 Flash, causing both safety bypasses and outright reasoning collapse.
Key Takeaways
- Multi-stream perturbation attacks exploit vulnerabilities in thinking-mode LLMs by interweaving multiple tasks in a single prompt.
- The attack uses three strategies to disrupt AI reasoning: multi-stream interleaving, inversion perturbation, and shape transformation.
- Testing shows attack success rates exceeding most existing methods across mainstream AI models, including the Qwen3 series and Gemini 2.5 Flash.
- The attack causes thinking-collapse rates of up to 17% and response-repetition rates reaching 60% in affected models.
- When successfully attacked, thinking-mode LLMs may generate more detailed harmful content because of their step-by-step reasoning process.
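To make the core idea concrete, here is a minimal sketch of what "multi-stream interleaving" could look like in prompt construction. This is a hypothetical illustration, not the paper's actual implementation: the function name, chunking scheme, and `[Stream N]` labels are all assumptions.

```python
from itertools import zip_longest

def interleave_streams(tasks: list[list[str]]) -> str:
    """Round-robin interleave chunks from several task streams into one prompt.

    Hypothetical sketch of 'multi-stream interleaving': each task is
    pre-split into chunks, and chunks from different tasks alternate so
    the model must juggle several concurrent reasoning threads at once.
    """
    lines = []
    for round_chunks in zip_longest(*tasks):
        for i, chunk in enumerate(round_chunks):
            if chunk is not None:  # shorter streams run out first
                lines.append(f"[Stream {i + 1}] {chunk}")
    return "\n".join(lines)

# Three toy task streams, each already split into chunks.
streams = [
    ["Translate: 'bonjour'", "Translate: 'merci'"],
    ["Compute 12 * 7", "Compute 9 + 4"],
    ["Reverse the word 'stack'"],
]
prompt = interleave_streams(streams)
```

The resulting prompt alternates between unrelated tasks line by line; the reported finding is that this kind of concurrent task load interferes with the model's safety reasoning rather than any single task being harmful on its own.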
Mentioned models: Gemini (Google)
#ai-security #llm-vulnerability #jailbreak-attack #thinking-models #safety-alignment #qwen3 #deepseek #gemini #multi-stream-attack #ai-safety
Read Original (via arXiv, CS AI)