AI · Bearish · arXiv – CS AI · 6h ago · 7/10
Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults
Researchers demonstrate that LLM agents' decisions can be systematically manipulated through adversarial feed curation: controlling the ordering and composition of the information sources an agent consumes before acting. Across 2,785 decision rollouts on four open-source LLMs, curated feeds could shift genuinely uncertain decisions from 5% to 100% in one direction, but they could not override firmly held model defaults. The finding locates a critical safety vulnerability in the upstream ranker layer rather than in the model itself.