AIBullisharXiv โ CS AI ยท 6h ago2
๐ง
AdaFocus: Knowing When and Where to Look for Adaptive Visual Reasoning
AdaFocus is a new training-free framework for adaptive visual reasoning in Multimodal Large Language Models that addresses perceptual redundancy and spatial attention issues. The system uses a two-stage pipeline with confidence-based cropping decisions and semantic-guided localization, achieving 4x faster inference than existing methods while improving accuracy.