AI · Neutral · arXiv – CS AI · 10h ago · 6/10
EmoInstruct-TTS: Dual-Path Instruction-Guided Emotional Speech Synthesis
EmoInstruct-TTS introduces a dual-path framework for emotional speech synthesis that enables fine-grained emotional control through natural language instructions. The system uses Emotion2embed, which covers 48 emotional states, and an Instruction-Conditioned Emotion Flow Model to convert free-form text instructions into acoustically grounded emotion representations, which are integrated with LLM-based synthesis pipelines.