y0news

#visual-perception News & Analysis

4 articles tagged with #visual-perception. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward

Researchers introduce Perception-R1, an approach that enhances the reasoning of multimodal large language models (MLLMs) by improving their visual perception through reinforcement learning with a visual perception reward. The method achieves state-of-the-art performance on multimodal reasoning benchmarks using only 1,442 training samples.

AI · Bullish · OpenAI News · Apr 16 · 7/10

Thinking with images

OpenAI has announced o3 and o4-mini, models that can reason with images as part of their chain of thought. The release marks a significant advance in the visual perception capabilities of multimodal AI.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Researchers introduce VisionFoundry, a synthetic data generation pipeline that uses LLMs and text-to-image models to create targeted training data for vision-language models. The approach addresses VLMs' weakness in visual perception tasks and demonstrates 7-10% improvements on benchmark tests without requiring human annotation or reference images.

AI · Neutral · arXiv – CS AI · Mar 27 · 5/10

MindSet: Vision. A toolbox for testing DNNs on key psychological experiments

Researchers have released MindSet: Vision, a comprehensive toolbox containing image datasets and scripts to test deep neural networks against 30 key psychological findings about human vision. The open-source tool provides systematic methods to evaluate how well AI models align with human visual perception and object recognition through controlled experimental conditions.