AI · Bullish · arXiv – CS AI · 7h ago · 6/10
Variational Adapter for Cross-modal Similarity Representation
Researchers introduce VACSR, a variational adapter that improves cross-modal similarity representation in vision-language models by treating annotation limitations as a variational inference problem. Binary classification boundaries compress a continuous similarity space into two classes; the method is reported to reduce the resulting false negatives and to improve generalization on image-text retrieval and domain adaptation tasks.