AI · Neutral · arXiv – CS AI · 3h ago · 6/10
Unlocking Fine-Grained and Within-Utterance Speaking Style Control in Prompt-Based Text-to-Speech Models
Researchers have developed techniques for fine-grained speaking style control in prompt-based text-to-speech models, enabling smooth style transitions both between utterances and within a single utterance. Changes between utterances are handled by interpolating in the embedding space, while style shifts within an utterance are handled by modifying the attention mechanism. The authors report high success rates in gender conversion and natural-sounding speaker transitions.
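
As a rough illustration of the embedding space interpolation idea, the sketch below linearly blends two style-prompt embeddings to produce a gradual transition across a sequence of utterances. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `interpolate_style` and `style_path` are hypothetical, the embeddings are plain lists of floats, and plain linear interpolation is assumed (the paper may use a different scheme).

```python
from typing import List, Sequence


def interpolate_style(
    emb_a: Sequence[float], emb_b: Sequence[float], alpha: float
) -> List[float]:
    """Blend two style embeddings: alpha=0 gives emb_a, alpha=1 gives emb_b.

    Hypothetical helper; linear interpolation is an assumption.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]


def style_path(
    emb_a: Sequence[float], emb_b: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced embeddings from emb_a to emb_b,
    one per utterance in a gradual style transition."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [
        interpolate_style(emb_a, emb_b, i / (steps - 1)) for i in range(steps)
    ]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings standing in for two style prompts.
    for emb in style_path([0.0, 2.0], [1.0, 4.0], steps=5):
        print(emb)
```

In a real system, each interpolated embedding would be passed to the TTS model in place of the encoded style prompt. Style changes within an utterance need a different mechanism, which the summary attributes to attention modifications and which this sketch does not cover.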