AI · Neutral · arXiv – CS AI · 17h ago · 5/10
Do Foundation Models Know Geometry? Probing Frozen Features for Continuous Physical Measurement
The research finds that vision-language models internally encode geometric information that they cannot effectively express through their text pathways. A lightweight linear probe trained on frozen features recovers hand joint angles with an error of 6.1 degrees, while the models' text output reaches only 20.0 degrees of error. The gap indicates a significant bottleneck in translating internal geometric representations into text.
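
To make the idea of a linear probe concrete, here is a minimal sketch in Python. It is not the paper's code: the features are synthetic stand-ins for frozen model activations, and the names `fit_linear_probe` and `mean_abs_error_deg`, the ridge penalty, the dimensions, and the noise level are all illustrative assumptions. Mean absolute error is also an assumed metric, since the summary does not say how the error is measured. The sketch only shows the mechanics: the backbone stays fixed, and the sole trained parameters are a single affine map from features to joint angles.

```python
import numpy as np


def fit_linear_probe(features, targets, l2=1e-2):
    """Fit an affine map from features to targets by closed-form ridge regression.

    Hypothetical helper: W and b are the only trained parameters,
    the features themselves are never updated.
    """
    mean_x = features.mean(axis=0)
    mean_y = targets.mean(axis=0)
    xc = features - mean_x  # centre inputs so the bias is not regularised
    yc = targets - mean_y
    dim = xc.shape[1]
    weights = np.linalg.solve(xc.T @ xc + l2 * np.eye(dim), xc.T @ yc)
    bias = mean_y - mean_x @ weights
    return weights, bias


def mean_abs_error_deg(pred, true):
    """Mean absolute error, in the same units as the targets (degrees here)."""
    return float(np.mean(np.abs(pred - true)))


# Synthetic stand-in data (assumption): in practice `features` would be
# activations read from a frozen vision-language model.
rng = np.random.default_rng(0)
n_samples, feat_dim, n_joints = 2000, 64, 15
features = rng.normal(size=(n_samples, feat_dim))
hidden_map = rng.normal(size=(feat_dim, n_joints))
angles = features @ hidden_map + rng.normal(scale=2.0, size=(n_samples, n_joints))

# Hold out the last 500 samples for evaluation.
train_x, test_x = features[:1500], features[1500:]
train_y, test_y = angles[:1500], angles[1500:]

W, b = fit_linear_probe(train_x, train_y)
probe_mae = mean_abs_error_deg(test_x @ W + b, test_y)

# Baseline: always predict the mean training angle for each joint.
baseline_pred = np.broadcast_to(train_y.mean(axis=0), test_y.shape)
baseline_mae = mean_abs_error_deg(baseline_pred, test_y)

print(f"probe MAE: {probe_mae:.2f} deg, mean-baseline MAE: {baseline_mae:.2f} deg")
```

If a probe this simple recovers the angles well, the information is linearly readable from the features, which is the kind of evidence the summary describes for geometry being encoded internally even when the text output is less accurate.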