
Semantic Flow Regularization: Teaching LLMs to Generate Diverse Yet Coherent Responses

arXiv – CS AI | Kerui Peng, Feifei Li, Xingyu Fan, Wenhui Que
🤖AI Summary

Researchers propose Semantic Flow Regularization (SFR), a training technique that counters the repetitive, low-diversity responses large language models produce when fine-tuned for specific styles or personas. SFR uses conditional flow matching to preserve output diversity while maintaining coherence, and it improves results on dialogue and code generation tasks without adding inference cost.

Analysis

The research targets a limitation of LLM fine-tuning: when models are optimized for specific styles or personas with a standard cross-entropy loss, they tend to collapse into nearly identical outputs regardless of prompt variation. This 'Cross-Style Collapse' undermines the utility of persona-conditioned systems, where diversity and creativity are essential for natural interaction.

Semantic Flow Regularization adds a lightweight auxiliary objective that applies flow matching to continuous sentence embeddings, keeping the output distribution multi-modal during training. Stochastic flow sources preserve the model's ability to generate varied responses, and the flow-matching component is discarded entirely at inference, so deployment cost is unchanged.

The validation spans multiple domains and model scales, from large industrial dialogue systems (Qwen3-32B across 9 personas) to open-source code generation benchmarks (LiveCodeBench-v5). Reported results show consistent improvements in output diversity, style adherence, and response quality. The paper also shows that existing Multi-Token Prediction techniques are degenerate special cases of the more general SFR framework.

The work addresses a basic tension in LLM fine-tuning: balancing diversity with controllability, a challenge that grows more relevant as applications demand both personalized responses and reliable behavior.
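The auxiliary objective described above can be illustrated with a minimal sketch of a conditional flow matching loss. This is not the authors' implementation: the Gaussian source, the straight-line interpolation path, the loss weight, and every function name here (`sample_source`, `velocity_head`, `training_loss`, and so on) are assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import random


def sample_source(dim, rng):
    """Stochastic flow source: one standard Gaussian draw (an assumption)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def interpolate(x0, x1, t):
    """Point at time t on the straight line from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path; it is constant in t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(velocity_head, hidden, target_emb, rng):
    """Single-sample flow matching loss for one target sentence embedding.

    velocity_head(x_t, t, hidden) is a hypothetical callable that predicts
    a velocity vector, conditioned on the language model's hidden state.
    """
    x0 = sample_source(len(target_emb), rng)   # noise sample
    t = rng.random()                           # time drawn from [0, 1)
    x_t = interpolate(x0, target_emb, t)       # point on the path
    predicted = velocity_head(x_t, t, hidden)
    target = target_velocity(x0, target_emb)
    # Mean squared error between predicted and target velocity.
    return sum((p - u) ** 2 for p, u in zip(predicted, target)) / len(target)


def training_loss(ce_loss, fm_loss, weight=0.1):
    """Cross-entropy plus the weighted auxiliary term.

    `weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    return ce_loss + weight * fm_loss
```

In this sketch only `training_loss` touches the flow term, so generation at inference would use the language model alone, which matches the summary's point that the flow-matching head adds no deployment cost.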

Key Takeaways
  • Cross-Style Collapse, the tendency of fine-tuned LLMs to produce nearly identical outputs, is attributed to cross-entropy objectives suppressing diverse continuations under shared representations.
  • Semantic Flow Regularization uses conditional flow matching with sentence-encoder embeddings to preserve output multi-modality without adding inference costs.
  • SFR improves diversity, style fidelity, and response quality on both dialogue and code generation tasks, which supports its generality beyond stylized dialogue.
  • The analysis shows that Multi-Token Prediction is a degenerate special case of the broader SFR framework, suggesting deeper connections among LLM training methods.
  • Zero deployment overhead (the flow-matching head is discarded at inference) makes SFR practical for production systems built on large-scale models.
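One common way to put a number on the kind of collapse described in the takeaways is the distinct-n ratio: the share of unique n-grams across a set of sampled responses. The summary does not say which diversity metrics the paper reports, so the metric choice, the whitespace tokenization, and the function name `distinct_n` below are all assumptions made for illustration.

```python
def distinct_n(responses, n=2):
    """Fraction of unique n-grams among all n-grams in the responses.

    Uses lowercase whitespace tokenization, a simplifying assumption.
    Returns 0.0 when no response is long enough to contain an n-gram.
    """
    ngrams = []
    for text in responses:
        tokens = text.lower().split()
        ngrams.extend(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Three identical samples versus three different ones.
collapsed = ["happy to help you today"] * 3
varied = [
    "happy to help you today",
    "let us debug this together",
    "what error do you see",
]
collapsed_score = distinct_n(collapsed)
varied_score = distinct_n(varied)
```

A set of identical samples scores lower than a varied set of the same size, which is the behavior a diversity-preserving objective would aim to avoid.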