AI · Bullish · arXiv – CS AI · Jun 11 · 7/10
🧠LUCID is a machine learning framework that learns robot manipulation skills from unstructured internet videos and human demonstrations, then transfers this knowledge to different robot embodiments through a shared intent model. The approach eliminates the need for expensive, embodiment-specific robot training data and demonstrates zero-shot transfer capabilities across multiple real-world tasks.
AI · Bullish · arXiv – CS AI · Jun 10 · 7/10
🧠Researchers introduce YUBI, a finger-aligned gripper that improves upon existing data collection systems for robotic manipulation by enabling more ergonomic, intuitive bimanual control. The team released an unprecedented 8,434-hour dataset across 1.20M episodes and demonstrated that policies trained on YUBI data transfer successfully across multiple robot platforms, advancing the development of robotic foundation models.
AI · Bullish · arXiv – CS AI · Jun 10 · 7/10
🧠UniDexTok introduces a unified tokenization system that standardizes how different dexterous robotic hands represent their states, enabling cross-embodiment learning from real-world data. By mapping diverse hand kinematics to a shared 22-degree-of-freedom interface, the system achieves sub-millimeter reconstruction accuracy—a 99% improvement over previous approaches—while eliminating the need for simulation or manual retargeting.
AI · Bullish · arXiv – CS AI · Jun 9 · 7/10
🧠EgoAERO introduces a framework enabling robots to learn dexterous manipulation skills from single egocentric human videos without requiring pre-scanned object assets or CAD models. The system reconstructs hand-object trajectories and converts them into robot policies, supported by a new large-scale dataset (EgoDex-R) containing 4.3M RGB-D frames, achieving performance comparable to traditional asset-dependent methods.
AI · Bullish · arXiv – CS AI · May 28 · 7/10
🧠Researchers present VERA, a decoupled approach to robot control that separates video prediction from action execution using inverse dynamics models. Rather than fine-tuning video models with action labels, the method keeps the video planner unchanged and trains embodiment-specific models to translate predicted frames into robot actions, enabling zero-shot cross-embodiment generalization.
AI · Bullish · arXiv – CS AI · Apr 7 · 7/10
🧠Researchers developed GRIT, a two-stage AI framework that learns dexterous robotic grasping from sparse taxonomy guidance, achieving an 87.9% success rate. The system first predicts grasp specifications from scene context, then generates finger motions that preserve the intended grasp structure, improving generalization to novel objects.
AI · Neutral · arXiv – CS AI · Jun 11 · 6/10
🧠Researchers developed a framework for teaching dexterous robotic hands to grasp objects using only touch sensation, without visual input or real-world demonstrations. The approach combines tactile sensor calibration, geometry-aware learning, and diffusion-based policy aggregation to achieve 27% grasp success on both seen and unseen objects.
AI · Neutral · arXiv – CS AI · Jun 11 · 6/10
🧠Researchers introduce InDex, a framework that adapts Vision-Language-Action (VLA) models from simple parallel grippers to complex dexterous robotic hands through intent-conditioned fine-tuning. The approach uses a two-stage architecture that preserves spatial reasoning capabilities while efficiently learning fine-grained multi-finger control with minimal training data.
AI · Bullish · arXiv – CS AI · May 29 · 6/10
🧠Researchers introduce BORA, an offline-to-online reinforcement learning framework that enables Vision-Language-Action (VLA) models to perform complex dexterous robotic manipulation tasks more reliably in real-world settings. The method combines offline critic training with lightweight online adaptation, achieving a 33% improvement in success rates over traditional imitation learning approaches.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce Center-of-Pressure (CoP), a physics-grounded tactile representation that enables robots to perform complex contact-rich manipulation tasks through sim-to-real transfer learning. The method preserves dense touch sensor information while remaining robust across simulation-to-reality gaps, demonstrating zero-shot transfer on dexterous hand tasks like peg insertion and ball balancing.
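For readers unfamiliar with the term, a center of pressure is conventionally the pressure-weighted centroid of the contact readings on a sensor surface. The sketch below shows only that textbook definition, not the paper's method. The function name, the 2D taxel layout, and the sample readings are hypothetical, and pressures are assumed to be non-negative.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def center_of_pressure(
    positions: Sequence[Point],
    pressures: Sequence[float],
    eps: float = 1e-9,
) -> Optional[Point]:
    """Return the pressure-weighted centroid of the taxel positions.

    Returns None when the total pressure is (near) zero, i.e. no contact.
    Assumes non-negative pressure readings (illustrative assumption).
    """
    if len(positions) != len(pressures):
        raise ValueError("positions and pressures must have the same length")

    total = sum(pressures)
    if total <= eps:
        return None  # no contact: the centroid is undefined

    x = sum(p * pos[0] for pos, p in zip(positions, pressures)) / total
    y = sum(p * pos[1] for pos, p in zip(positions, pressures)) / total
    return (x, y)


# Hypothetical 2x2 taxel grid with more pressure on the right-hand column.
taxels = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
readings = [1.0, 3.0, 1.0, 3.0]
cop = center_of_pressure(taxels, readings)
print(cop)
```

With these sample readings the centroid shifts toward the right-hand column, which illustrates how a single low-dimensional point can summarize where contact force is concentrated across a dense sensor array.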
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠Researchers introduce DexHiL, a human-in-the-loop framework for improving Vision-Language-Action models in robotic dexterous manipulation tasks. The system allows real-time human corrections during robot execution and demonstrates 25% higher success rates than standard offline training methods.