AI · Neutral · arXiv – CS AI · 10h ago · 6/10
A DVDrive Approach for doScenes Instructed Driving Challenge
Researchers submitted OmniDrive, a vision-language-action driving agent, to the doScenes Instructed Driving Challenge. The agent predicts autonomous vehicle trajectories from visual context, motion history, and natural language instructions. The team introduced a divided-view perception module that reduces cross-view interference in multi-camera visual grounding, so that language instructions align more closely with driving-relevant visual evidence.