AI · Bullish · arXiv – CS AI · 5h ago
Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward
Researchers introduce Perception-R1, an approach that strengthens multimodal reasoning in multimodal large language models (MLLMs) by training them with reinforcement learning that includes a visual perception reward, so the model is rewarded for perceiving image content accurately. The authors report state-of-the-art performance on multimodal reasoning benchmarks using only 1,442 training samples.