AI · Bearish · arXiv – CS AI · 10h ago · 7/10
LLM-Agnostic Semantic Representation Attack
Researchers have developed the Semantic Representation Attack (SRA), an adversarial technique that bypasses LLM safety mechanisms by targeting semantic meaning rather than specific text patterns. The method achieves a 99.71% attack success rate across 26 open-source models and transfers strongly between models, raising significant security concerns for deployed AI systems.