HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions
Researchers introduce HA-VLN 2.0, a benchmark for vision-and-language navigation that explicitly incorporates human-aware constraints in both discrete and continuous environments. The study reveals significant performance degradation in leading navigation agents when confronted with dynamic multi-human interactions, emphasizing the critical need for social-awareness modeling in autonomous navigation systems.
HA-VLN 2.0 addresses a fundamental gap in autonomous navigation research by moving beyond simplified environments to tackle real-world complexity involving multiple dynamic humans. Traditional vision-and-language navigation systems have operated in controlled settings with minimal human presence, leaving a critical blind spot for deployment in populated spaces like hospitals, offices, and public venues. This work establishes standardized metrics that balance goal accuracy with personal-space adherence, ensuring robots don't achieve navigation targets at the expense of human comfort and safety.
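To make the idea of balancing goal accuracy against personal-space adherence concrete, here is a minimal Python sketch of how such a combined score could be computed. It is not the benchmark's official metric definition: the `Episode` structure, the function names, the 3.0 m goal radius, and the 1.0 m personal-space radius are all illustrative assumptions.

```python
from dataclasses import dataclass
from math import hypot

Point = tuple[float, float]


@dataclass
class Episode:
    """Hypothetical episode record; not the benchmark's actual data format."""
    agent_path: list[Point]          # agent (x, y) position at each step
    human_paths: list[list[Point]]   # one track per human, same length as agent_path
    goal: Point                      # target (x, y) position


def reached_goal(ep: Episode, goal_radius: float = 3.0) -> bool:
    """True if the agent's final position is within goal_radius of the goal (assumed threshold)."""
    x, y = ep.agent_path[-1]
    return hypot(x - ep.goal[0], y - ep.goal[1]) <= goal_radius


def personal_space_violations(ep: Episode, personal_radius: float = 1.0) -> int:
    """Count the steps at which the agent is closer than personal_radius to any human."""
    count = 0
    for t, (ax, ay) in enumerate(ep.agent_path):
        if any(hypot(ax - h[t][0], ay - h[t][1]) < personal_radius
               for h in ep.human_paths):
            count += 1
    return count


def success_rate(episodes: list[Episode]) -> float:
    """Fraction of episodes that reach the goal, ignoring humans entirely."""
    return sum(reached_goal(ep) for ep in episodes) / len(episodes)


def human_aware_success_rate(episodes: list[Episode]) -> float:
    """Fraction of episodes that reach the goal with zero personal-space violations."""
    ok = sum(reached_goal(ep) and personal_space_violations(ep) == 0
             for ep in episodes)
    return ok / len(episodes)
```

Under this sketch, an agent that reaches every goal but brushes past a person on the way scores well on `success_rate` and poorly on `human_aware_success_rate`, which is the kind of gap a human-aware metric is meant to expose.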
The benchmark's significance lies in its comprehensive approach: the HAPS 2.0 dataset and accompanying simulators model realistic multi-human interactions, the evaluation covers 16,844 socially grounded instructions, and empirical validation includes real robot experiments demonstrating sim-to-real transfer. Leading agents show sharp performance drops under human dynamics and partial observability, which points to engineering challenges that prior work had underestimated. This suggests current architectures lack sufficient mechanisms for modeling pedestrian behavior and inferring social intentions.
For the AI industry, HA-VLN 2.0 supports a shift toward human-centric robotics development. Companies deploying autonomous systems in human environments face liability risks if robots create unsafe conditions, making social-awareness research commercially relevant. The open leaderboard enables transparent benchmarking, which can accelerate community progress on this problem. The reported link between explicit social modeling and fewer collisions gives developers a quantifiable incentive to prioritize human-robot interaction safety.
- HA-VLN 2.0 introduces standardized metrics combining navigation accuracy with personal-space adherence for human-aware robot navigation.
- Leading navigation agents show sharp performance degradation in dynamic multi-human environments, revealing current model limitations.
- Real-world robot validation demonstrates successful sim-to-real transfer, bridging simulation-reality gaps in social navigation.
- Explicit social modeling measurably reduces collisions and improves robustness, establishing human-centric approaches as essential rather than optional.
- The open benchmark and leaderboard provide transparent evaluation infrastructure for advancing safe autonomous navigation research.