AIBearisharXiv โ CS AI ยท 1d ago7/10
๐ง
Enhancing Jailbreak Attacks on LLMs via Persona Prompts
Researchers developed a genetic algorithm-based method using persona prompts to exploit large language models, reducing refusal rates by 50-70% across multiple LLMs. The study reveals significant vulnerabilities in AI safety mechanisms and demonstrates how these attacks can be enhanced when combined with existing methods.