#inference-scaling News & Analysis

4 articles tagged with #inference-scaling. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bullish · arXiv – CS AI · Mar 16 · 7/10


Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents

Researchers propose Budget-Aware Value Tree (BAVT), a training-free framework that makes LLM agents more efficient by managing their compute budget during multi-hop reasoning tasks. The system outperforms traditional approaches while using 4x fewer resources, suggesting that careful budget management can beat brute-force compute scaling.

AI · Bearish · arXiv – CS AI · Apr 20 · 6/10


Where does output diversity collapse in post-training?

Researchers find that post-trained language models suffer a systematic collapse in output diversity: fine-tuning reduces the variety of generated responses compared to base models. The collapse is determined during training by data composition choices and cannot be fixed through inference-time adjustments, which has implications for scaling methods and creative AI applications.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10


Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search

Researchers introduce Chain-in-Tree (CiT), a framework that optimizes large language model tree search by branching only when necessary rather than at every step. The approach reduces computational overhead by 75–85% on math reasoning tasks with minimal accuracy loss, making inference-time scaling more practical for resource-constrained deployments.

AI · Bullish · Synced Review · Apr 11 · 6/10


DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT

DeepSeek AI has published research detailing a new technique called SPCT for enhancing the scalability of general reward models during inference. The development signals progress toward their next-generation R2 model with improved inference scaling capabilities.