AI · Bullish · arXiv – CS AI · 11h ago · 6/10
Mitigating Cross-Image Information Leakage in Multi-Image Understanding with Large Vision-Language Models
Researchers introduce FOCUS, a training-free method that improves how Large Vision-Language Models handle multi-image inputs. It masks irrelevant images with noise so that visual information from different images does not become entangled in the model's representations.
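To make the masking idea concrete, here is a minimal toy sketch in Python. It is not the FOCUS implementation: the summary does not say how the noise is generated or how the masked inputs are used afterwards. Everything here is an assumption for illustration, including the helper names `noise_like` and `mask_all_but`, the use of uniform noise, and the representation of images as nested lists of pixel values.

```python
import random
from typing import List

# Toy grayscale image: rows of pixel values in [0, 1] (an assumption for illustration).
Image = List[List[float]]


def noise_like(image: Image, rng: random.Random) -> Image:
    """Return uniform random noise with the same shape as `image`."""
    return [[rng.random() for _ in row] for row in image]


def mask_all_but(images: List[Image], keep: int, seed: int = 0) -> List[Image]:
    """Hypothetical helper: keep images[keep] intact and replace the rest with noise."""
    if not 0 <= keep < len(images):
        raise IndexError("keep is out of range")
    rng = random.Random(seed)
    return [
        img if i == keep else noise_like(img, rng)
        for i, img in enumerate(images)
    ]


# Three tiny 2x2 "images"; only the second one is relevant to the question.
images = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
masked = mask_all_but(images, keep=1)
```

In this sketch the relevant image passes through unchanged while the others keep their shape but lose their content, so a model given `masked` could only draw visual detail from the kept image.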