arXiv · CS AI
EgoGrasp: World-Space Hand-Object Interaction Estimation from Egocentric Videos
EgoGrasp is presented as the first method to reconstruct world-space hand-object interactions from egocentric videos with open-vocabulary objects. Its multi-stage framework combines vision foundation models with body-guided diffusion models, and it is reported to achieve state-of-the-art performance in 3D scene reconstruction and hand pose estimation.