y0news
#ai-reasoning
5 articles
AI · Bullish · arXiv – CS AI · 4h ago · 4

ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference

Researchers propose ODAR-Expert, an adaptive routing framework for large language models that optimizes accuracy-efficiency trade-offs by dynamically routing queries between fast and slow processing agents. The system achieved 98.2% accuracy on MATH benchmarks while reducing computational costs by 82%, suggesting that optimal AI scaling requires adaptive resource allocation rather than simply increasing test-time compute.

AI · Bullish · arXiv – CS AI · 4h ago · 6

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance

Researchers propose SCOPE, a new framework for Reinforcement Learning from Verifiable Rewards (RLVR) that improves AI reasoning by salvaging partially correct solutions rather than discarding them entirely. The method achieves 46.6% accuracy on math reasoning tasks and 53.4% on out-of-distribution problems by using step-wise correction to maintain exploration diversity.

AI · Bearish · arXiv – CS AI · 4h ago · 3

Humans and LLMs Diverge on Probabilistic Inferences

Researchers created ProbCOPA, a dataset testing probabilistic reasoning in humans versus AI models, finding that state-of-the-art LLMs consistently fail to match human judgment patterns. The study reveals fundamental differences in how humans and AI systems process non-deterministic inferences, highlighting limitations in current AI reasoning capabilities.

AI · Neutral · arXiv – CS AI · 4h ago · 2

From Static Benchmarks to Dynamic Protocol: Agent-Centric Text Anomaly Detection for Evaluating LLM Reasoning

Researchers propose a dynamic agent-centric benchmarking system for evaluating large language models that replaces static datasets with autonomous agents that generate, validate, and solve problems iteratively. The protocol uses teacher, orchestrator, and student agents to create progressively challenging text anomaly detection tasks that expose reasoning errors missed by conventional benchmarks.

AI · Bullish · arXiv – CS AI · 4h ago · 5

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Researchers introduce SAGE (Self-Aware Guided Efficient Reasoning), a novel sampling paradigm that improves AI reasoning efficiency by helping large reasoning models know when to stop thinking. The approach targets redundant, lengthy reasoning chains that do not improve accuracy, and by cutting them it reduces computational costs and response times.