#failure-detection News & Analysis

15 articles tagged with #failure-detection. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 19 · 7/10

Tri-Info: Generalizable, Interpretable Failure Prediction for VLA Models via Information Theory

Researchers have developed Tri-Info, an information-theoretic framework for detecting failures in Vision-Language-Action (VLA) models that generalizes across different architectures and environments without retraining. The method achieves 83% accuracy on real-world tasks by analyzing three key signals—action diversity, temporal consistency, and state coupling—making it a significant advance in interpretable AI safety for autonomous systems.

AI · Neutral · arXiv – CS AI · Jun 9 · 7/10

Strained Coherence: A Pre-Failure Signal in Coding Agent Execution Trajectories

Researchers identify 'strained coherence' as a safety failure mode in which LLM-based coding agents acknowledge problems in their reasoning but proceed anyway, similar to reward hacking. With a detector built on Claude Sonnet, 94% of flagged trajectories go on to fail versus 46% of unflagged ones, suggesting the pattern is a reliable pre-failure signal.

🧠 Claude · 🧠 Sonnet

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

ActProbe: Action-Space Probe for Early Failure Detection of Generative Robot Policies

Researchers introduce ActProbe, a lightweight failure detection system for generative robot policies that analyzes action signals to predict failures before they occur. The method improves failure detection accuracy by 12.7% over existing approaches and demonstrates real-world effectiveness on robot manipulation tasks.

AI · Neutral · arXiv – CS AI · May 12 · 7/10

From Detection to Recovery: Operational Analysis on LLM Pre-training with 504 GPUs

A production analysis of a 504-GPU NVIDIA B200 cluster reveals that large-scale AI training requires multi-signal failure detection strategies, with a 100% detection rate achieved through statistical analysis of 751 metrics. The study identifies storage I/O bottlenecks invisible at smaller scales and shows auto-retry mechanisms succeed 2.7x more often than manual recovery, providing critical operational insights for distributed AI infrastructure.

🏢 Nvidia

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning

Researchers introduce RFT-FaultBench, the first comprehensive benchmark for diagnosing failures in reinforcement fine-tuning of large language models, and propose RFT-FM, an automated framework for detecting, diagnosing, and remediating training failures. This addresses a critical gap in LLM post-training reliability where practitioners currently rely on manual inspection.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning

Researchers identify 'cliff tokens'—specific points in LLM reasoning where a single token triggers failure in mathematical problem-solving. By deleting these tokens and resampling, models recover near-perfect accuracy, demonstrating that failures stem from precise decision points rather than diffuse errors. A taxonomy of cliff types enables targeted optimization that improves model reasoning by up to 6.6%.

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

AEGIS: A Backup Reflex for Physical AI

Researchers introduce AEGIS, a machine learning method that prevents robot manipulation failures by detecting high-risk steps and switching to a stronger policy only when needed. The system recovers 10.1% of failed trajectories while using stronger policies for just 38% of steps, demonstrating that selective escalation outperforms both blind backup policies and random triggering approaches.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

How Language Models Fail: Token-Level Signatures of Committed and Persistent Reasoning Failures

Researchers have identified two distinct failure modes in large language model reasoning: committed failures where models lock onto incorrect paths early, and persistent uncertainty failures where doubt accumulates throughout reasoning. The framework, validated across 23 model-dataset configurations, provides diagnostic signatures for detecting reasoning failures and offers practical implications for improving self-consistency methods.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Early Diagnosis of Wasted Computation in Multi-Agent LLM Systems via Failure-Aware Observability

Researchers introduce a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems, identifying six failure modes through online trace signals. Testing on 165 GAIA validation traces reveals 41% failure rates across difficulty levels and token consumption ranging from 8,152 to 16,389 tokens, positioning observability as a diagnostic layer between execution logs and accuracy.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

RuleEdit: Failure-Guided Human-AI Model Editing with Prospective Impact Preview

RuleEdit is an interactive AI system that helps practitioners detect model failures and preview the impact of edits before implementation. Tested in stroke rehabilitation assessment, it increased human-AI performance by 14.16% through interpretable failure signals and prospective impact previews, though it revealed a critical local-global performance tradeoff where edits optimizing specific cases can degrade broader performance.

AI · Bullish · arXiv – CS AI · Jun 1 · 6/10

Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring

Researchers propose Hide-and-Seek, a machine learning framework that detects failures in Vision-Language-Action (VLA) models during robot execution by identifying failure-indicative actions from trajectory-level data alone. The method achieves state-of-the-art performance across multiple VLA policies and robotic platforms without requiring expensive step-level annotations or external models.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors

PrefixGuard introduces a framework for monitoring LLM-agent execution in real time, detecting failures before they occur through prefix analysis rather than post-hoc outcome checks. The system combines offline trace induction with supervised learning and outperforms both raw-text baselines and direct LLM judging approaches across multiple benchmarks.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

DeepTest Tool Competition 2026: Benchmarking an LLM-Based Automotive Assistant

The first LLM Testing competition at ICSE 2026's DeepTest workshop evaluated four tools designed to benchmark an LLM-based automotive assistant, focusing on their ability to identify failure cases where the system fails to surface critical safety warnings from car manuals. The competition assessed both the effectiveness of test discovery and the diversity of identified failures, establishing a benchmark for evaluating AI testing methodologies in safety-critical applications.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Adaptive Confidence Regularization for Multimodal Failure Detection

Researchers propose Adaptive Confidence Regularization (ACR), a new framework for detecting failures in multimodal AI systems used in critical applications like autonomous vehicles and medical diagnostics. The approach uses confidence degradation detection and synthetic failure generation to improve reliability of AI predictions in high-stakes scenarios.

AI · Neutral · arXiv – CS AI · Mar 17 · 4/10

Failure Detection in Chemical Processes Using Symbolic Machine Learning: A Case Study on Ethylene Oxidation

Researchers developed a symbolic machine learning approach for predicting failures in chemical processes, specifically testing on ethylene oxidation. The method outperformed traditional AI models while maintaining interpretability through rule-based systems, addressing safety concerns in chemical industries where black-box AI models are unsuitable.