AI · Bullish · arXiv – CS AI · 8h ago · 6/10
HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents
Researchers introduce HyperEyes, a parallel multimodal search agent that processes multiple entities concurrently rather than sequentially, achieving 9.9% higher accuracy with 5.3x fewer tool calls than comparable systems. The system combines visual grounding and retrieval into atomic actions and uses dual-level reinforcement learning to optimize both accuracy and inference efficiency, addressing a gap in existing multimodal AI benchmarks that ignore computational cost.
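The difference between sequential and concurrent entity processing can be illustrated with a small sketch. This is not the HyperEyes implementation: `lookup_entity`, `sequential_search`, and `parallel_search` are hypothetical names, the tool latency is simulated, and counting agent "turns" is only an assumed stand-in for how tool-call cost might be measured.

```python
import asyncio


async def lookup_entity(entity: str) -> str:
    """Hypothetical stand-in for one combined grounding-and-retrieval action."""
    await asyncio.sleep(0.01)  # simulated tool latency, not a real search call
    return f"result for {entity}"


async def sequential_search(entities: list[str]) -> tuple[list[str], int]:
    """Handle one entity per agent turn, so the turn count grows with the entity count."""
    results = []
    for entity in entities:
        results.append(await lookup_entity(entity))
    return results, len(entities)


async def parallel_search(entities: list[str]) -> tuple[list[str], int]:
    """Dispatch every entity lookup concurrently within a single agent turn."""
    results = await asyncio.gather(*(lookup_entity(e) for e in entities))
    return list(results), 1 if entities else 0


if __name__ == "__main__":
    names = ["landmark", "logo", "person"]
    print(asyncio.run(sequential_search(names)))
    print(asyncio.run(parallel_search(names)))
```

Both functions return the same results in the same order, since `asyncio.gather` preserves input order; only the number of turns differs in this toy setup.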