AI Bullish · arXiv CS AI · 10h ago · 6/10
Learning Vision-Language-Action World Models for Autonomous Driving
Researchers present VLA-World, a vision-language-action model that combines predictive world modeling with reflective reasoning for autonomous driving. The system generates future frames conditioned on candidate action trajectories, then reasons over these imagined scenarios to refine its predictions, achieving state-of-the-art performance on planning and future-frame generation benchmarks.
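The imagine-then-reflect loop described in the summary can be sketched as follows. This is a hypothetical illustration, not code from the paper: `world_model`, `reflect`, and `plan` are toy stand-ins for the model's learned components, using scalar "frames" so the control flow is easy to follow.

```python
# Hypothetical sketch of the predict-then-reflect planning loop.
# All names and dynamics here are illustrative, not from VLA-World itself.

def world_model(frame, action):
    """Stub: predict the next frame given the current frame and an action."""
    return frame + action  # toy dynamics on scalar "frames"

def reflect(frames, goal):
    """Stub: score an imagined rollout against a goal (lower is better)."""
    return abs(frames[-1] - goal)

def plan(start, goal, candidate_trajectories):
    """Roll out each candidate action trajectory through the world model,
    then keep the trajectory whose imagined future scores best on reflection."""
    best_traj, best_score = None, float("inf")
    for traj in candidate_trajectories:
        frames = [start]
        for action in traj:
            frames.append(world_model(frames[-1], action))
        score = reflect(frames, goal)
        if score < best_score:
            best_traj, best_score = traj, score
    return best_traj

print(plan(0, 3, [[1, 1], [1, 1, 1], [2, 2]]))  # → [1, 1, 1]
```

In the real system the rollout would produce imagined video frames and the reflection step would be a learned reasoning pass rather than a scalar error, but the structure, imagine futures under actions and refine the plan from them, is the same.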