AIBullisharXiv โ CS AI ยท 5h ago1
๐ง
Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs
Researchers introduce VC-STaR, a new framework that improves visual reasoning in vision-language models by using contrastive image pairs to reduce hallucinations. The approach creates VisCoR-55K, a new dataset that outperforms existing visual reasoning methods when used for model fine-tuning.