AI · Neutral · arXiv – CS AI · 6h ago · 6/10
EchoStyle: Unlocking High-Fidelity Video Stylization with Reverse Data Synthesis
EchoStyle is a text-driven framework for high-fidelity video stylization that targets two long-standing problems: style drift and motion distortion. Its reverse-synthesis pipeline produces V-Style20k, a dataset of 20k video pairs, and sliding-window inference lets it process videos of arbitrary length. Reported performance is comparable to leading proprietary solutions.
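The summary does not say how EchoStyle's sliding-window inference is implemented, so the following is only a generic sketch of the idea, not the authors' method. It assumes fixed-size overlapping windows, a final window shifted back to cover the tail, and simple averaging where windows overlap. The names `window_spans`, `stylize_long`, and `stylize_clip`, the default sizes, and the use of one float per frame as a stand-in for image data are all hypothetical.

```python
from typing import Callable, List, Tuple


def window_spans(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans that together cover every frame."""
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if num_frames <= 0:
        return []
    if num_frames <= window:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    # Assumption: shift one last window back so the tail frames are covered.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


def stylize_long(
    frames: List[float],
    stylize_clip: Callable[[List[float]], List[float]],
    window: int = 16,
    stride: int = 8,
) -> List[float]:
    """Apply stylize_clip per window and average outputs where windows overlap.

    Each frame is a single float here purely for illustration; a real model
    would operate on image tensors, and averaging is an assumed blend rule.
    """
    totals = [0.0] * len(frames)
    counts = [0] * len(frames)
    for start, end in window_spans(len(frames), window, stride):
        out = stylize_clip(frames[start:end])
        for offset, value in enumerate(out):
            totals[start + offset] += value
            counts[start + offset] += 1
    return [t / c for t, c in zip(totals, counts)]
```

With `stride` smaller than `window`, neighbouring windows share frames, which is one common way to reduce visible seams between chunks. Whether EchoStyle blends overlaps this way is not stated in the summary.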