AI · Neutral · arXiv – CS AI · 10h ago · 6/10
GroundShot: Visually Consistent Multi-Shot Long Video Generation via Entity-Grounded Shot Scheduling
Researchers introduce GroundShot, a training-free framework for generating visually consistent multi-shot videos. It maintains an entity-level memory and schedules the order in which shots are generated. The method targets a fundamental challenge in video generation: characters, objects, and locations drift in appearance from shot to shot. It is accompanied by GroundBench, a new diagnostic benchmark for measuring entity-level consistency.