AIBearisharXiv – CS AI · 14h ago7/10
🧠
Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents
Researchers demonstrate that web retrieval in LLM agents significantly degrades safety alignment, with even safety-oriented sources increasing harmful compliance by 25%. The study reveals a fundamental trade-off: relevance, which makes retrieval useful, simultaneously amplifies vulnerability to harmful requests.