🧠 AI | Neutral | Importance 7/10

Bridging the Detection-to-Abstention Gap in Reasoning Models under Insufficient Information

arXiv – CS AI | Renjie Gu, Jiaxu Li, Yihao Wang, Yun Yue, Hansong Xiao, Yefei Chen, Yuan Wang, Chunxiao Guo, Pei Wei, Jinjie Gu, Yixin Cao
🤖 AI Summary

Researchers identify a critical failure mode in large reasoning models where they detect insufficient information but still produce unsupported answers instead of abstaining. The proposed Judge-Then-Solve (JTS) framework trains models to make explicit answerability commitments before reasoning, significantly improving safe abstention rates and inference efficiency.

Analysis

Large reasoning models show a paradoxical vulnerability: they can recognize that a problem lacks sufficient information, yet still go on to produce confident but unfounded answers. This detection-to-abstention gap is a disconnect between what the model notices about missing data and how it actually behaves. It is especially dangerous in high-stakes applications such as medical diagnostics, where an incorrect answer causes more harm than an honest refusal.

The Judge-Then-Solve framework addresses this by restructuring the reasoning pipeline. Rather than treating abstention as a final-answer formatting choice, JTS makes it a control decision that occurs before solution generation. Models receive supervised training to issue an answerability judgment up front, then proceed to solve only if the problem is judged answerable. Reinforcement learning with consistency and length-shaping rewards further refines this behavior, so that models terminate unproductive reasoning trajectories early instead of continuing them.
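The control flow described above can be sketched in a few lines of Python. This is only an illustration of the judge-before-solve ordering, not the paper's implementation: the `judge` and `solve` callables, the `ABSTAIN` sentinel, the `Result` type, and the toy stand-ins are all hypothetical names introduced here. In the paper, both roles are played by a trained model.

```python
from dataclasses import dataclass
from typing import Callable

ABSTAIN = "ABSTAIN"  # hypothetical sentinel for a refusal to answer


@dataclass
class Result:
    answerable: bool     # the verdict committed to before solving
    answer: str          # final answer, or ABSTAIN
    solver_called: bool  # whether any solution reasoning was spent


def judge_then_solve(
    question: str,
    judge: Callable[[str], bool],
    solve: Callable[[str], str],
) -> Result:
    """Commit to an answerability verdict first; solve only if it is positive."""
    if not judge(question):
        # The verdict is binding: abstain immediately and skip the solver,
        # so no downstream reasoning is generated for this question.
        return Result(answerable=False, answer=ABSTAIN, solver_called=False)
    return Result(answerable=True, answer=solve(question), solver_called=True)


# Toy stand-ins (assumptions for illustration only).
def toy_judge(question: str) -> bool:
    # Treat a literal "[missing]" marker as a missing premise.
    return "[missing]" not in question


def toy_solve(question: str) -> str:
    return "120 km"


if __name__ == "__main__":
    ok = judge_then_solve("A train travels at 60 km/h for 2 hours. How far?", toy_judge, toy_solve)
    bad = judge_then_solve("A train travels at [missing] km/h for 2 hours. How far?", toy_judge, toy_solve)
    print(ok)
    print(bad)
```

The point of the design is that abstention is decided by a branch taken before the solver runs, so an unanswerable question costs only the judgment step.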

This work has direct implications for AI safety and deployment. The reported push toward "Abstention@Detection" saturation, where detected insufficiency reliably leads to abstention, points to a safer way of deploying reasoning models. By eliminating unnecessary downstream reasoning on unanswerable problems, JTS improves safety and computational efficiency at the same time. The observation that missing-premise training also reduces unproductive self-reflection on difficult but answerable questions suggests that these techniques improve reasoning quality beyond the unanswerable cases.
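The summary describes Abstention@Detection only informally. One plausible reading, which is an assumption here and not the paper's formal definition, is the fraction of cases in which the model detected insufficient information that also ended in abstention. The record fields `detected_insufficient` and `abstained` below are hypothetical names chosen for this sketch.

```python
from typing import Optional


def abstention_at_detection(records: list[dict]) -> Optional[float]:
    """Share of detected-insufficient cases that ended in abstention.

    Assumed reading of the metric; returns None when nothing was detected,
    since the ratio is undefined in that case.
    """
    detected = [r for r in records if r["detected_insufficient"]]
    if not detected:
        return None
    abstained = sum(1 for r in detected if r["abstained"])
    return abstained / len(detected)


# Made-up evaluation records, for illustration only.
records = [
    {"detected_insufficient": True, "abstained": True},
    {"detected_insufficient": True, "abstained": True},
    {"detected_insufficient": True, "abstained": False},  # detected, answered anyway
    {"detected_insufficient": True, "abstained": True},
    {"detected_insufficient": False, "abstained": False},  # excluded from the ratio
]

score = abstention_at_detection(records)
```

Under this reading, the detection-to-abstention gap is simply the distance between this score and 1.0, and "saturation" means the score approaches 1.0.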

For organizations deploying reasoning models in regulated domains, JTS offers a basic control mechanism. Future work will likely focus on generalizing these techniques across model architectures and on scaling them to longer reasoning chains, where the cost of reasoning that is later abandoned becomes more pronounced.

Key Takeaways
  • Large reasoning models can detect missing information but still generate unsupported answers, creating a detection-to-abstention gap
  • The Judge-Then-Solve framework trains models to commit to an answerability judgment before reasoning, acting as an explicit control mechanism
  • JTS reduces harmful reasoning by terminating unanswerable trajectories early, improving both safety and inference efficiency
  • Missing-premise training improves abstention rates across datasets and reduces unproductive self-reflection on difficult problems
  • Abstention control emerges as a critical safety requirement for deploying reasoning models in high-risk domains