AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems
Researchers introduce AgentForesight, a framework for detecting errors in LLM-based multi-agent systems in real-time during task execution rather than after failure occurs. The system uses a compact 7B-parameter model trained on a curated dataset of 2,000 agentic trajectories and outperforms GPT-4.1 and DeepSeek-V4-Pro in identifying failure points, enabling intervention before cascading errors compromise entire task chains.
AgentForesight addresses a critical vulnerability in deployed multi-agent AI systems where individual agent errors propagate undetected through downstream processes, ultimately causing complete trajectory failure. Current approaches rely on post-hoc analysis after tasks have already failed, offering no opportunity for correction. This research reframes the problem as online auditing—continuously monitoring task execution and identifying the exact step where a decisive error occurs, enabling real-time intervention.
The framework's strength lies in its methodological rigor. Researchers curated AFTraj-2K, a corpus combining coding, mathematics, and agentic task trajectories where unsafe paths are annotated at their failure point by consensus among multiple LLM judges. Rather than relying on a single evaluator's judgment, this consensus approach reduces bias and improves dataset quality. AgentForesight-7B is trained through coarse-to-fine reinforcement learning, first developing risk awareness at safe/unsafe boundaries, then refining step-level precision across three evaluation dimensions: what error occurred, where in the trajectory it happened, and which agent caused it.
The performance gains—up to 19.9% improvement over proprietary models and 3× lower localization error—suggest practical applicability for production systems. For developers deploying multi-agent systems on critical tasks, this technology provides a deployment-time safeguard that's both more accurate and more parameter-efficient than existing alternatives. The shift from post-hoc diagnosis to real-time detection fundamentally changes the risk profile of autonomous systems, potentially accelerating adoption in high-stakes domains like financial analysis, medical research, and code generation where error cascades carry significant consequences.
- →AgentForesight enables real-time error detection in multi-agent systems rather than post-failure diagnosis, allowing intervention before errors cascade.
- →A 7B-parameter model outperforms GPT-4.1 and DeepSeek-V4-Pro with 19.9% better performance and 3× lower localization error on failure prediction.
- →The AFTraj-2K dataset uses consensus-based annotation from multiple LLM judges across coding, math, and agentic domains for higher reliability.
- →Three-axis reward training targets the what, where, and who dimensions of audit verdicts, improving precision in error localization.
- →Real-time auditing capability reduces deployment risk for autonomous systems in critical applications like financial analysis and code generation.