AI | Bearish | Importance: 7/10 | Actionable
Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference
AI Summary
Researchers have discovered a new 'multi-stream perturbation attack' that breaks the safety mechanisms of thinking-mode large language models by overwhelming them with multiple interleaved tasks. The attack achieves high success rates against major LLMs including the Qwen3 series, DeepSeek, and Gemini 2.5 Flash, causing both safety bypasses and outright reasoning collapse.
Key Takeaways
- Multi-stream perturbation attacks exploit vulnerabilities in thinking-mode LLMs by interweaving multiple tasks in a single prompt.
- The attack uses three strategies to disrupt AI reasoning: multi-stream interleaving, inversion perturbation, and shape transformation.
- Testing shows attack success rates exceeding most existing methods across mainstream AI models, including the Qwen3 series and Gemini 2.5 Flash.
- The attack causes thinking-collapse rates of up to 17% and response-repetition rates reaching 60% in affected models.
- When successfully attacked, thinking-mode LLMs may generate more detailed harmful content because of their step-by-step reasoning process.
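To make the core idea concrete, here is a minimal sketch of what "multi-stream interleaving" could look like in prompt construction. This is a hypothetical illustration, not the paper's actual implementation: the function name, chunking scheme, and `[Stream N]` labels are all assumptions.

```python
from itertools import zip_longest

def interleave_streams(tasks: list[list[str]]) -> str:
    """Round-robin interleave chunks from several task streams into one prompt.

    Hypothetical sketch of 'multi-stream interleaving': each task is
    pre-split into chunks, and chunks from different tasks alternate so
    the model must juggle several concurrent reasoning threads at once.
    """
    lines = []
    for round_chunks in zip_longest(*tasks):
        for i, chunk in enumerate(round_chunks):
            if chunk is not None:  # shorter streams run out first
                lines.append(f"[Stream {i + 1}] {chunk}")
    return "\n".join(lines)

# Three toy task streams, each already split into chunks.
streams = [
    ["Translate: 'bonjour'", "Translate: 'merci'"],
    ["Compute 12 * 7", "Compute 9 + 4"],
    ["Reverse the word 'stack'"],
]
prompt = interleave_streams(streams)
```

The resulting prompt alternates between unrelated tasks line by line; the reported finding is that this kind of concurrent task load interferes with the model's safety reasoning rather than any single task being harmful on its own.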
Mentioned models: Gemini (Google)
#ai-security #llm-vulnerability #jailbreak-attack #thinking-models #safety-alignment #qwen3 #deepseek #gemini #multi-stream-attack #ai-safety
Read Original (via arXiv, CS AI)