arXiv – CS AI
Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization
Researchers propose a render-free framework for 3D-aware video diffusion models that uses compressed mesh tokens instead of 2D rendered guidance to control human motion in generated videos. By processing 3D geometric information directly alongside video tokens, the approach demonstrates improved performance on motion control tasks while reducing artifacts associated with traditional 2D guidance methods.