AINeutralarXiv – CS AI · 5h ago6/10
🧠
TRACE: Trajectory Reasoning through Adaptive Cross-Step Evidence Aggregation for LLM Agents
Researchers introduce TRACE, a monitoring framework designed to detect malicious behavior in autonomous LLM agents by tracking evidence across long sequences of seemingly benign actions. The system achieves 0.713 F1 score and 0.844 recall on benchmark tests, addressing a critical security gap where agents can pursue hidden objectives through temporally distributed steps.