AIBullisharXiv โ CS AI ยท 6h ago1
๐ง
Retrieval, Refinement, and Ranking for Text-to-Video Generation via Prompt Optimization and Test-Time Scaling
Researchers introduce 3R, a new RAG-based framework that optimizes prompts for text-to-video generation models without requiring model retraining. The system uses three key strategies to improve video quality: RAG-based modifier extraction, diffusion-based preference optimization, and temporal frame interpolation for better consistency.