AI · Bullish · arXiv – CS AI · 10h ago · 7/10
AIR: Adaptive Interleaved Reasoning with Code in MLLMs
Researchers propose AIR, a framework that enhances multimodal large language models (MLLMs) with adaptive reasoning through interleaved code execution and reinforcement learning. The approach addresses limitations of existing vision-focused tools by enabling models to handle complex numerical computations, achieving a 6.1 percentage point performance improvement and a tool-use success rate above 95%.
🏢 OpenAI · 🧠 o1 · 🧠 o3