DuoBench: A Reproducible Benchmark for Bimanual Manipulation in Simulation and the Real World
Researchers introduce DuoBench, a comprehensive benchmarking framework for evaluating bimanual robotic manipulation policies on the FR3 Duo platform. The framework includes eleven tasks implemented in simulation and real-world settings, with reproducible recipes and human-teleoperated datasets that reveal significant challenges in current dual-arm AI policies, particularly in coordination and sim-to-real transfer.
DuoBench addresses a critical gap in robotics research by providing the first standardized benchmark explicitly designed for bimanual manipulation. While single-arm robotic benchmarks have matured significantly, coordinating two arms simultaneously introduces substantially more complex control dynamics and additional failure modes that existing evaluation frameworks fail to capture. This work establishes reproducibility standards through 3D-printable assets and detailed task recipes, so that groups beyond well-funded labs can take part.
The research emerges as robotics development increasingly targets complex manipulation tasks that require two-arm coordination, from industrial assembly to healthcare applications. Current state-of-the-art policies, including imitation learning and vision-language-action models, show substantial performance gaps when deployed on actual hardware, particularly during early interaction phases and parallel arm execution. This sim-to-real transfer problem is a fundamental bottleneck limiting practical deployment of autonomous bimanual systems.
For the robotics and AI development sectors, DuoBench provides diagnostic infrastructure for understanding policy failure modes systematically. The stage-based evaluation scheme lets developers identify the specific stage at which coordination breaks down, rather than receiving only a binary success metric. This finer-grained failure analysis shortens iteration cycles and directs research effort toward the stages where policies actually fail.
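To illustrate why stage-based scoring is more informative than a single success flag, here is a minimal Python sketch. It is not DuoBench's actual evaluation code: the stage names, the `stage_report` and `first_failure_histogram` helpers, and the episode data are all hypothetical and chosen only for illustration. Each episode is recorded as the number of consecutive stages the policy completed.

```python
from collections import Counter

# Hypothetical stage names; DuoBench's real stage definitions may differ.
STAGES = ["reach", "grasp", "coordinate", "place"]


def stage_report(episodes, stages=STAGES):
    """Per-stage success rates.

    episodes: list of ints, each the number of stages completed in order
              (0 means the first stage failed, len(stages) means full success).
    Returns {stage: {"cumulative": ..., "conditional": ...}} where
      cumulative  = fraction of all episodes that completed this stage
      conditional = fraction of episodes that reached this stage and completed it
    """
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    report = {}
    for i, name in enumerate(stages):
        reached = sum(1 for e in episodes if e >= i)
        passed = sum(1 for e in episodes if e >= i + 1)
        report[name] = {
            "cumulative": passed / n,
            "conditional": passed / reached if reached else None,
        }
    return report


def first_failure_histogram(episodes, stages=STAGES):
    """Count how many episodes failed first at each stage."""
    return Counter(stages[e] for e in episodes if e < len(stages))


if __name__ == "__main__":
    # Toy data for illustration only, not results from the benchmark.
    toy_episodes = [4, 4, 1, 0, 2, 1, 0, 4]
    binary_success = sum(e == len(STAGES) for e in toy_episodes) / len(toy_episodes)
    print("binary success rate:", binary_success)
    for stage, rates in stage_report(toy_episodes).items():
        print(stage, rates)
    print("first failures:", dict(first_failure_histogram(toy_episodes)))
```

In this toy data the binary success rate alone would hide that most failures occur in the first two stages, which is the kind of early-interaction weakness a stage-based scheme is meant to surface.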
The framework's open-source release and comprehensive dataset collection will likely accelerate dual-arm policy research across academia and industry. Future developments may focus on bridging the substantial sim-to-real gap, improving parallel execution performance, and developing more robust coordination algorithms that handle the increased complexity of bimanual systems.
- DuoBench is the first comprehensive benchmarking framework specifically designed for evaluating bimanual robotic manipulation policies.
- Current AI policies show significant performance gaps in dual-arm coordination, particularly in early interaction stages and parallel execution.
- The framework includes reproducible real-world validation with 3D-printable assets and stage-based evaluation for fine-grained failure analysis.
- Sim-to-real transfer remains a major challenge, with policies trained in simulation struggling substantially on actual hardware.
- Open-source release of code, datasets, and videos aims to accelerate broader research community progress on dual-arm manipulation.