AIBearisharXiv โ CS AI ยท Feb 276/107
๐ง
Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
Researchers evaluated prompt injection and jailbreak vulnerabilities across multiple open-source LLMs including Phi, Mistral, DeepSeek-R1, Llama 3.2, Qwen, and Gemma. The study found significant behavioral variations across models and that lightweight defense mechanisms can be consistently bypassed by long, reasoning-heavy prompts.