y0news

#visual-perception News & Analysis

5 articles tagged with #visual-perception. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 4d ago · 7/10

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Researchers introduce Moment-Video, a benchmark revealing that current video multimodal large language models (MLLMs) struggle to understand momentary visual events that last only a few frames. Across 33 models tested, the best achieves only 39.6% accuracy, exposing a critical gap in temporal fidelity that persists despite advances in general video understanding.

AI · Bullish · OpenAI News · Apr 16 · 7/10 · 5

Thinking with images

OpenAI has announced the o3 and o4-mini models, which mark a breakthrough in AI visual perception: they can reason with images as part of their chain of thought, a significant advance in multimodal AI.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

Researchers introduce VisionFoundry, a synthetic data generation pipeline that uses LLMs and text-to-image models to create targeted training data for vision-language models. The approach addresses VLMs' weakness in visual perception tasks and demonstrates 7-10% improvements on benchmark tests without requiring human annotation or reference images.

AI · Neutral · arXiv – CS AI · Mar 27 · 5/10

MindSet: Vision. A toolbox for testing DNNs on key psychological experiments

Researchers have released MindSet: Vision, a comprehensive toolbox containing image datasets and scripts to test deep neural networks against 30 key psychological findings about human vision. The open-source tool provides systematic methods to evaluate how well AI models align with human visual perception and object recognition through controlled experimental conditions.