AINeutralarXiv – CS AI · 9h ago6/10
🧠
MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action
Researchers introduce MPCoT, a multi-path latent reasoning framework for Vision-Language-Action policies that improves decision-making in complex, long-horizon control tasks without adding inference latency. The system evaluates multiple hypothetical action paths using reward signals and aggregates them before final action selection, demonstrating performance gains on robotics benchmarks.