y0news
#sparse-autoencoders2 articles
2 articles
AIBullisharXiv โ€“ CS AI ยท 4h ago4
๐Ÿง 

Interpretable Debiasing of Vision-Language Models for Social Fairness

Researchers have developed DeBiasLens, a new framework that uses sparse autoencoders to identify and deactivate social bias neurons in Vision-Language models without degrading their performance. The model-agnostic approach addresses concerns about unintended social bias in VLMs by making the debiasing process interpretable and targeting internal model dynamics rather than surface-level fixes.

AINeutralarXiv โ€“ CS AI ยท 4h ago0
๐Ÿง 

Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry

Researchers analyzed DINOv2 vision transformer using Sparse Autoencoders to understand how it processes visual information, discovering that the model uses specialized concept dictionaries for different tasks like classification and segmentation. They propose the Minkowski Representation Hypothesis as a new framework for understanding how vision transformers combine conceptual archetypes to form representations.