y0news
#computational-efficiency
10 articles
AI · Bullish · arXiv – CS AI · 4h ago · 4

ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference

Researchers propose ODAR-Expert, an adaptive routing framework for large language models that optimizes accuracy-efficiency trade-offs by dynamically routing queries between fast and slow processing agents. The system achieved 98.2% accuracy on MATH benchmarks while reducing computational costs by 82%, suggesting that optimal AI scaling requires adaptive resource allocation rather than simply increasing test-time compute.

AI · Bullish · arXiv – CS AI · 4h ago · 9

Quant Experts: Token-aware Adaptive Error Reconstruction with Mixture of Experts for Large Vision-Language Models Quantization

Researchers introduce Quant Experts (QE), a new post-training quantization technique for Vision-Language Models that uses adaptive error compensation with a mixture-of-experts architecture. The method addresses computational and memory overhead by treating token-dependent and token-independent channels differently, maintaining performance comparable to full-precision models across 2B to 70B parameter scales.

AI · Neutral · arXiv – CS AI · 4h ago · 4

Memory Caching: RNNs with Growing Memory

Researchers introduce Memory Caching (MC), a technique that enhances recurrent neural networks by allowing their memory capacity to grow with sequence length, bridging the gap between fixed-memory RNNs and growing-memory Transformers. The approach offers four variants and shows competitive performance with Transformers on language modeling and long-context tasks while maintaining better computational efficiency.

AI · Bullish · arXiv – CS AI · 4h ago · 7

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Researchers introduce MITS (Mutual Information Tree Search), a new framework that improves reasoning capabilities in large language models using information-theoretic principles. The method uses pointwise mutual information for step-wise evaluation and achieves better performance while being more computationally efficient than existing tree search methods like Tree-of-Thought.
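For readers unfamiliar with the term, pointwise mutual information (PMI) has a standard definition: log( p(x, y) / (p(x) p(y)) ), or equivalently log p(y | x) - log p(y). The sketch below only illustrates that definition. The function names are hypothetical, and how MITS actually scores reasoning steps with PMI is not described in the summary above.

```python
import math


def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Pointwise mutual information in nats: log( p(x, y) / (p(x) * p(y)) )."""
    return math.log(p_xy / (p_x * p_y))


def pmi_from_logprobs(logp_y_given_x: float, logp_y: float) -> float:
    """Equivalent form using log-probabilities: log p(y | x) - log p(y)."""
    return logp_y_given_x - logp_y


if __name__ == "__main__":
    # Independent events carry zero PMI; positively associated events score above zero.
    print(pmi(0.25, 0.5, 0.5))
    print(pmi(0.5, 0.5, 0.5))
```

A positive value means y is more likely given x than on its own, which is one plausible reason such a score could be used to judge whether a reasoning step is specific to the question at hand.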

AI · Bullish · arXiv – CS AI · 4h ago · 5

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Researchers introduce SAGE (Self-Aware Guided Efficient Reasoning), a novel sampling paradigm that improves reasoning efficiency by helping large reasoning models know when to stop thinking. The approach targets redundant, lengthy reasoning chains that do not improve accuracy, with the aim of reducing computational costs and response times.

AI · Bullish · arXiv – CS AI · 4h ago · 7

Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty

Researchers developed ARLCP, a reinforcement learning framework that reduces unnecessary reflection in Large Reasoning Models, achieving 53% shorter responses while improving accuracy by 5.8% on smaller models. The method addresses computational inefficiencies in AI reasoning by dynamically balancing efficiency and accuracy through adaptive penalties.

AI · Neutral · arXiv – CS AI · 4h ago · 6

Efficient Ensemble Conditional Independence Test Framework for Causal Discovery

Researchers introduce E-CIT (Ensemble Conditional Independence Test), a new framework that significantly reduces computational costs in causal discovery by partitioning data into subsets and aggregating results. The method achieves linear computational complexity while maintaining competitive performance, particularly on real-world datasets.
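The partition-and-aggregate idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the E-CIT implementation: `ensemble_ci_test`, `test_fn`, and `subset_size` are hypothetical names, and Fisher's method is swapped in as a stand-in aggregator because the summary does not say how E-CIT combines results. With a fixed subset size, the number of base tests grows in proportion to the sample count, which is where linear cost would come from.

```python
import math
from typing import Callable, Sequence


def fisher_combine(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method (stand-in aggregator).

    The statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom
    under the null, and for even degrees of freedom the survival function has
    the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    Intended for a moderate number of subsets; very large k may overflow.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    # Clamp to avoid log(0) when a base test reports p == 0.
    half_x = -sum(math.log(max(p, 1e-300)) for p in p_values)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def ensemble_ci_test(
    rows: Sequence,
    test_fn: Callable[[Sequence], float],
    subset_size: int,
) -> float:
    """Run test_fn on disjoint chunks of rows and aggregate the p-values.

    test_fn is any conditional independence test that maps a chunk of samples
    to a p-value (hypothetical interface). The final chunk may be shorter than
    subset_size; a real implementation would likely merge or drop it.
    """
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    p_values = [
        test_fn(rows[start:start + subset_size])
        for start in range(0, len(rows), subset_size)
    ]
    return fisher_combine(p_values)
```

Because the chunks are disjoint, their p-values are independent when the samples are, which is the condition Fisher's method relies on.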

AI · Bullish · arXiv – CS AI · 4h ago · 10

Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation

Researchers developed Speculative Verdict (SV), a training-free framework that improves large Vision-Language Models' ability to reason over information-dense images by combining multiple small draft models with a larger verdict model. The approach achieves better accuracy on visual question answering benchmarks while reducing computational costs compared to large proprietary models.

AI · Neutral · arXiv – CS AI · 4h ago · 0

Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem

Researchers developed RL-CMSA, a hybrid reinforcement learning approach for solving the min-max Multiple Traveling Salesman Problem that combines probabilistic clustering, exact optimization, and solution refinement. The method outperforms existing algorithms by balancing exploration and exploitation to minimize the longest tour across multiple salesmen.
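The min-max objective itself is simple to state in code: the cost of a solution is the length of its longest tour, not the sum of all tours. The sketch below assumes Euclidean distances and a single shared depot, neither of which is specified in the summary, and the function names are hypothetical. It evaluates a candidate solution only and does not implement RL-CMSA.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def tour_length(depot: Point, stops: Sequence[Point]) -> float:
    """Length of the closed tour depot -> stops in order -> depot."""
    path = [depot, *stops, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def min_max_objective(depot: Point, tours: Sequence[Sequence[Point]]) -> float:
    """Min-max mTSP cost: the length of the longest tour among all salesmen."""
    return max(tour_length(depot, stops) for stops in tours)


if __name__ == "__main__":
    depot = (0.0, 0.0)
    tours = [[(3.0, 4.0)], [(0.0, 1.0), (0.0, 2.0)]]
    # The first tour (length 10) dominates the second (length 4).
    print(min_max_objective(depot, tours))
```

Minimizing this quantity pushes solutions toward balanced workloads, since shortening any tour other than the longest one does not improve the score.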
