AI · Bearish · arXiv – CS AI · Apr 14 · 7/10
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems
Researchers have identified a jailbreaking vulnerability in LLMs, termed 'Salami Slicing Risk,' in which attackers chain multiple low-risk inputs that each bypass safety measures individually but cumulatively trigger harmful outputs. Their Salami Attack framework demonstrates success rates above 90% against GPT-4o and Gemini, highlighting a critical gap in current multi-turn defense mechanisms, which assume that monitoring individual requests is adequate.
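The gap described above can be illustrated with a toy example. This is a minimal sketch, not the paper's Salami Attack framework or any vendor's safety filter. It assumes each turn has already been assigned a numeric risk score by some classifier, and that risk accumulates additively over a conversation. The scores, the two thresholds, and the function names `passes_per_turn` and `first_cumulative_violation` are all hypothetical, chosen only to show how a per-turn check can pass while a conversation-level check fails.

```python
from typing import Optional, Sequence


def passes_per_turn(scores: Sequence[float], limit: float) -> bool:
    """Per-turn check: every individual score must stay below the limit."""
    return all(score < limit for score in scores)


def first_cumulative_violation(scores: Sequence[float], budget: float) -> Optional[int]:
    """Conversation-level check (assumed additive risk).

    Returns the index of the first turn at which the running total
    reaches the budget, or None if the budget is never reached.
    """
    total = 0.0
    for index, score in enumerate(scores):
        total += score
        if total >= budget:
            return index
    return None


# Hypothetical risk scores for four turns of one conversation.
turn_scores = [0.2, 0.3, 0.25, 0.3]

PER_TURN_LIMIT = 0.5        # hypothetical per-message threshold
CONVERSATION_BUDGET = 1.0   # hypothetical cumulative threshold

per_turn_result = passes_per_turn(turn_scores, PER_TURN_LIMIT)
violation_turn = first_cumulative_violation(turn_scores, CONVERSATION_BUDGET)

print("every turn passes the per-turn check:", per_turn_result)
print("turn index where cumulative budget is reached:", violation_turn)
```

In this toy setup every turn stays under the per-turn limit, yet the running total reaches the conversation budget on the fourth turn. Whether real risk combines additively is an open assumption here; the sketch only shows why a monitor that looks at requests in isolation can miss a slow accumulation.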
🧠 GPT-4 · 🧠 Gemini