AINeutralarXiv – CS AI · 15h ago6/10
🧠
Respecting Modality Gap in Post-hoc Out-of-distribution Detection with Pre-trained Vision-Language Models
Researchers challenge the standard approach of using text embeddings as class prototypes in out-of-distribution detection with vision-language models, demonstrating a fundamental misalignment between text and visual feature spaces. They propose an online pseudo-supervised framework that learns visual prototypes directly from unlabeled test data, achieving state-of-the-art OOD detection performance.