AI · Neutral · arXiv – CS AI · 18h ago · 5/10
Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation
Researchers present a training-free video retrieval-augmented generation (Video RAG) system that decouples semantic retrieval from logical reasoning, aiming to improve cross-lingual video comprehension and reduce hallucinations. The two-stage pipeline first runs dense retrieval over clean visual data, then applies LLM-powered cognitive reranking to the retrieved candidates; the authors report strong precision in both information retrieval and persona-conditioned generation.
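The coarse-to-fine idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary-based corpus, and the pluggable `score_fn` (a stand-in for the LLM reranker) are all assumptions made for illustration. The coarse stage ranks clips by cosine similarity of precomputed embeddings; the fine stage rescores only the surviving candidates.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coarse_retrieve(query_vec: Vector, corpus: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1 (coarse): keep the k clip ids closest to the query embedding."""
    ranked = sorted(corpus, key=lambda cid: cosine(query_vec, corpus[cid]), reverse=True)
    return ranked[:k]


def fine_rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    n: int,
) -> List[str]:
    """Stage 2 (fine): reorder candidates with a reasoning-based scorer.

    score_fn is hypothetical; in a real system it might wrap an LLM call
    that judges how well a clip answers the query.
    """
    ranked = sorted(candidates, key=lambda cid: score_fn(query, cid), reverse=True)
    return ranked[:n]


def coarse_to_fine(
    query: str,
    query_vec: Vector,
    corpus: Dict[str, Vector],
    score_fn: Callable[[str, str], float],
    k: int = 3,
    n: int = 1,
) -> List[str]:
    """Run semantic retrieval first, then reasoning-based reranking."""
    return fine_rerank(query, coarse_retrieve(query_vec, corpus, k), score_fn, n)
```

Because the reranker only sees the k candidates that pass the coarse stage, the expensive reasoning step runs on a small shortlist, and a clip that the semantic stage rejects cannot be reintroduced later.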