AINeutralarXiv – CS AI · 9h ago6/10
🧠
Towards One-to-Many Temporal Grounding
Researchers introduce One-to-Many Temporal Grounding (OMTG), a new AI task for localizing multiple video segments matching a single text query. They establish the first OMTG benchmark with 56k samples and novel evaluation metrics, achieving 43.65% performance—outperforming advanced models like Gemini 2.5 Pro by 15.85%.
🧠 Gemini