AI · Neutral · arXiv – CS AI · 18h ago · 6/10
How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects
Researchers present a method for detecting hallucinations in vision-language models (VLMs) by measuring sample complexity under counterfactual perturbations. Using circuit discovery techniques and causal influence metrics, they establish empirical bounds on the minimum number of counterfactual samples needed to reliably identify unstable, hallucinated predictions.