MAVEN A Multi-Agent Framework for Multicultural Text-to-Video Generation
Researchers introduce MAVEN, a multi-agent framework that improves text-to-video generation's ability to accurately represent multiple cultures within single prompts. The team contributes a new benchmark dataset of 243 culturally grounded prompts across Chinese, American, and Romanian cultures, demonstrating that specialized agent-based prompt refinement significantly enhances cultural fidelity while maintaining visual quality.