AI · Neutral · arXiv – CS AI · 4h ago
Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
Researchers analyzed the DINOv2 vision transformer with Sparse Autoencoders to understand how it processes visual information, and found that the model draws on specialized concept dictionaries for different tasks such as classification and segmentation. They propose the Minkowski Representation Hypothesis as a new framework for understanding how vision transformers combine conceptual archetypes to form representations.