AIBullisharXiv – CS AI · 7h ago7/10
🧠
Real2SAM2Real: Generative 3D Caches as Complementary Context for Video Diffusion
Researchers introduce Real2SAM2Real, a framework that enhances Video Diffusion Models by incorporating explicit 3D geometric caches extracted from SAM3D models, enabling more precise control over camera movements and scene dynamics while maintaining structural consistency in complex occlusions and high-motion scenarios.