y0news
#human-robot-interaction1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 6h ago1
๐Ÿง 

Monocular 3D Object Position Estimation with VLMs for Human-Robot Interaction

Researchers developed a Vision-Language Model capable of estimating 3D object positions from monocular RGB images for human-robot interaction. The model achieved a median accuracy of 13mm and can make acceptable predictions for robot interaction in 25% of cases, representing a five-fold improvement over baseline methods.