AI · Neutral · arXiv – CS AI · 8h ago · 7/10
Global Geometry Is Not Enough for Vision Representations
Researchers show that global embedding geometry, the standard measure for evaluating vision model representations, fails to predict compositional binding capabilities. Functional sensitivity, measured through input-output Jacobians, proves far more reliable. The results indicate that current training objectives optimize embedding geometry while leaving the local input-output mapping unconstrained, which suggests that representation learning needs a more nuanced evaluation framework.
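For readers unfamiliar with the term, an input-output Jacobian records how each output coordinate of an encoder changes when each input coordinate is nudged. The sketch below is illustrative only: `toy_encoder` is a hypothetical two-dimensional stand-in for a vision model, and using the Frobenius norm of the Jacobian as a sensitivity score is an assumption made here, not necessarily the measure used in the paper.

```python
import math


def toy_encoder(x):
    """Hypothetical 2-D -> 2-D encoder standing in for a vision model."""
    return [math.tanh(x[0] + 0.5 * x[1]), math.tanh(x[0] * x[1])]


def jacobian_fd(f, x, eps=1e-5):
    """Central finite-difference estimate of J[i][j] = d f_i / d x_j at x."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


# Local sensitivity of the toy encoder at one input point (assumed score).
x = [0.0, 0.0]
J = jacobian_fd(toy_encoder, x)
sensitivity = frobenius_norm(J)
print(J, sensitivity)
```

For a real network, an automatic-differentiation library would normally replace the finite-difference loop. The point of the sketch is only that the Jacobian describes the local input-output mapping at a given input, which is information that pairwise distances between embeddings do not directly expose.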