Many-Tier Instruction Hierarchy in LLM Agents
Researchers propose Many-Tier Instruction Hierarchy (ManyIH), a new framework for resolving conflicts among instructions given to large language model agents from multiple sources with varying authority levels. Current models achieve only ~40% accuracy when navigating up to 12 conflicting instruction tiers, revealing a critical safety gap in agentic AI systems.