y0news
#social-fairness1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 4h ago4
๐Ÿง 

Interpretable Debiasing of Vision-Language Models for Social Fairness

Researchers have developed DeBiasLens, a new framework that uses sparse autoencoders to identify and deactivate social bias neurons in Vision-Language models without degrading their performance. The model-agnostic approach addresses concerns about unintended social bias in VLMs by making the debiasing process interpretable and targeting internal model dynamics rather than surface-level fixes.