y0news
#frame-selection
1 article
AI · Neutral · arXiv – CS AI · 1d ago · 7/10

Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding

Researchers propose DIG, a training-free framework that improves long-form video understanding by adapting frame selection strategies based on query types. The system uses uniform sampling for global queries and specialized selection for localized queries, achieving better performance than existing methods while scaling to 256 input frames.
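
The query-dependent routing described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the "global"/"localized" labels passed in as strings, and the per-frame relevance scores are all hypothetical, and a plain top-k pick stands in for whatever specialized selection DIG actually uses for localized queries.

```python
from typing import List, Sequence


def uniform_indices(num_frames: int, budget: int) -> List[int]:
    """Pick `budget` evenly spaced frame indices out of `num_frames`."""
    if budget >= num_frames:
        return list(range(num_frames))
    step = num_frames / budget
    # Take the centre of each of the `budget` equal-width segments.
    return [int(step * i + step / 2) for i in range(budget)]


def top_k_indices(scores: Sequence[float], budget: int) -> List[int]:
    """Hypothetical stand-in for localized selection:
    keep the highest-scoring frames, returned in temporal order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


def select_frames(query_type: str, scores: Sequence[float], budget: int) -> List[int]:
    """Route to a selection strategy based on an (assumed) query-type label."""
    if query_type == "global":
        # Global queries: cover the whole video evenly, ignoring the scores.
        return uniform_indices(len(scores), budget)
    if query_type == "localized":
        # Localized queries: concentrate the budget on query-relevant frames.
        return top_k_indices(scores, budget)
    raise ValueError(f"unknown query type: {query_type!r}")


if __name__ == "__main__":
    # Made-up relevance scores for an 8-frame clip.
    scores = [0.1, 0.9, 0.2, 0.8, 0.05, 0.7, 0.3, 0.0]
    print(select_frames("global", scores, 4))
    print(select_frames("localized", scores, 3))
```

In this sketch the query type is supplied by the caller; how DIG decides whether a query is global or localized is not covered by the summary and is left out here.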