AI · Bullish · arXiv – CS AI · 9h ago · 5/10
Learning Question-Aware Keyframe Selection with Synthetic Supervision for Video Question Answering
Researchers developed a question-aware keyframe selection framework for video question answering. The selector is trained on pseudo labels generated by large multimodal models, combined with coverage regularization. The method is reported to improve accuracy significantly on temporal and causal questions in the NExT-QA dataset, and selecting keyframes rather than processing every frame reduces inference cost.
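
To make the idea of relevance-plus-coverage selection concrete, here is a minimal sketch in Python. It is not the paper's method: the summary gives no details of the selector or the regularizer, so the function `select_keyframes`, the per-frame `relevance` scores (standing in for question-conditioned scores a trained selector might output), the `coverage_weight` parameter, and the greedy distance-based coverage term are all assumptions made for illustration.

```python
from typing import List, Sequence


def select_keyframes(
    relevance: Sequence[float],
    k: int,
    coverage_weight: float = 0.5,
) -> List[int]:
    """Greedily pick up to k frame indices.

    Hypothetical illustration: each pick maximizes the frame's relevance
    score plus a bonus for being temporally far from frames already chosen.
    """
    n = len(relevance)
    if n == 0 or k <= 0:
        return []

    selected: List[int] = []
    remaining = set(range(n))

    while remaining and len(selected) < k:

        def gain(i: int) -> float:
            if not selected:
                # First pick: relevance only.
                return relevance[i]
            # Distance to the nearest chosen frame, normalized to [0, 1].
            nearest = min(abs(i - j) for j in selected) / max(n - 1, 1)
            return relevance[i] + coverage_weight * nearest

        # Iterate in index order so ties resolve to the earliest frame.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)

    return sorted(selected)
```

With scores `[0.9, 0.85, 0.1, 0.2, 0.8, 0.1]` and `k=2`, a `coverage_weight` of `0.0` returns the two adjacent top-scoring frames `[0, 1]`, while `0.5` returns `[0, 4]`, trading a little relevance for wider temporal spread. That trade-off is the kind of behavior a coverage term is usually meant to encourage, which matters for temporal and causal questions that depend on events far apart in the video.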