AIBearisharXiv โ CS AI ยท 14h ago7/10
๐ง
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems
Researchers have identified a novel jailbreaking vulnerability in LLMs called 'Salami Slicing Risk,' where attackers chain multiple low-risk inputs that individually bypass safety measures but cumulatively trigger harmful outputs. The Salami Attack framework demonstrates over 90% success rates against GPT-4o and Gemini, highlighting a critical gap in current multi-turn defense mechanisms that assume individual requests are adequately monitored.
๐ง GPT-4๐ง Gemini