#value-functions News & Analysis

2 articles tagged with #value-functions. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bullish · arXiv – CS AI · Apr 14 · 7/10


Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers

Researchers introduce RL^V, a reinforcement learning method that unifies LLM reasoners with generative verifiers to improve test-time compute scaling. The approach achieves over 20% accuracy gains on MATH benchmarks and enables 8-32x more efficient test-time scaling compared to existing RL methods by preserving and leveraging learned value functions.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3


VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models

Researchers introduce VITA, a zero-shot value function learning method that enhances Vision-Language Models through test-time adaptation for robotic manipulation tasks. The system updates parameters sequentially over trajectories to improve temporal reasoning and generalizes across diverse environments, outperforming existing autoregressive VLM methods.