arXiv – CS AI · 10h ago
Where Reliability Lives in Vision-Language Models: A Mechanistic Study of Attention, Hidden States, and Causal Circuits
Researchers challenge the widespread assumption that sharp attention maps in vision-language models indicate reliable outputs. Through mechanistic analysis of three VLM families (LLaVA, PaliGemma, Qwen2-VL), they find that attention structure is nearly uncorrelated with output correctness, while hidden-state geometry and late-layer causal circuits are far more predictive of model reliability.
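The contrast the summary draws, attention sharpness versus hidden-state geometry as predictors of correctness, can be sketched on synthetic data. Everything below is illustrative, not the paper's actual method or results: the data is random, and the "probe" is a minimal difference-of-means linear probe standing in for whatever probing setup the authors use.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 32

# Synthetic "correctness" labels and per-example features (illustrative only).
y = rng.integers(0, 2, size=n)                       # 1 = model answered correctly
attn_entropy = rng.normal(size=n)                    # attention-sharpness proxy, made uninformative
hidden = rng.normal(size=(n, d)) + y[:, None] * 0.8  # hidden states carry a correctness direction

def auc(score, label):
    """Rank-based AUC: probability a positive example outscores a negative one."""
    pos, neg = score[label == 1], score[label == 0]
    return (pos[:, None] > neg[None, :]).mean()

# Minimal linear probe: project hidden states onto the difference of class means.
direction = hidden[y == 1].mean(0) - hidden[y == 0].mean(0)
probe_score = hidden @ direction

print(f"attention-entropy AUC: {auc(attn_entropy, y):.2f}")   # near 0.5 (chance)
print(f"hidden-state probe AUC: {auc(probe_score, y):.2f}")   # well above chance
```

Under this construction the attention statistic carries no information about correctness while the hidden states do, reproducing the qualitative finding: a simple probe on hidden-state geometry predicts reliability where attention structure does not.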