AI · Bullish · arXiv – CS AI · 8h ago · 6/10
Text Dictates, Music Decorates: Energy-based Attention for Editable Dance Motion Generation
Researchers introduce STREAM, a diffusion transformer model that generates danceable choreography from text and music. It decouples the two conditioning pathways so that the music signal does not dominate and override the semantic control provided by the text. The team also releases Motorica++, an enhanced dataset with semantic annotations, and proposes new evaluation metrics (Exchange Evaluation Protocol and Editable Dance Score) to measure zero-shot editability in generative motion synthesis.