AI · Neutral · arXiv – CS AI · 9h ago · 6/10
Seeing Time: Benchmarking Chronological Reasoning and Shortcut Biases in Vision-Language Models
Researchers introduce ChronoVision, a benchmark dataset for evaluating how Vision-Language Models (VLMs) reason about temporal information across images. The study finds that VLMs often rely on superficial visual shortcuts, such as color filters, rather than genuine chronological reasoning when making temporal judgments.