AI · Neutral · arXiv (CS AI) · 5h ago · 7/10
Distorted or Fabricated? A Survey on Hallucination in Video LLMs
Researchers have conducted a comprehensive survey on hallucinations in Video Large Language Models (Vid-LLMs). It identifies two core types, dynamic distortion and content fabrication, and traces their root causes to limitations in temporal representation and insufficient visual grounding. The study reviews evaluation benchmarks and mitigation strategies, and proposes future directions, including motion-aware encoders and counterfactual learning, to improve reliability.