AI · Bullish · arXiv CS AI · 4h ago · 7/10
Does RLVR Extend Reasoning Boundaries? Investigating Capability Expansion in Vision-Language Models
Researchers introduce Ariadne, a framework demonstrating that Reinforcement Learning with Verifiable Rewards (RLVR) expands the spatial reasoning capabilities of Vision-Language Models beyond their base distribution. Tests on synthetic mazes and real-world navigation benchmarks show that the technique enables models to solve previously unsolvable problems, suggesting genuine capability expansion rather than merely improved sampling efficiency.