#non-autoregressive News & Analysis

4 articles tagged with #non-autoregressive. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 6h ago · 6/10

Continuous Latent Diffusion Language Model

Researchers propose Cola DLM, a hierarchical latent diffusion language model that generates text through continuous semantic modeling rather than traditional left-to-right autoregressive decoding. The approach achieves comparable performance to autoregressive models while offering greater flexibility, better scaling properties, and a potential pathway for unified modeling across discrete and continuous modalities.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models

Researchers identify a critical failure mode in non-autoregressive diffusion language models caused by proximity bias, where the denoising process concentrates on adjacent tokens, creating spatial error propagation. They propose a minimal-intervention approach using a lightweight planner and temperature annealing to guide early token selection, achieving substantial improvements on reasoning and planning tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

SyncSpeech: Efficient and Low-Latency Text-to-Speech based on Temporal Masked Transformer

Researchers introduce SyncSpeech, a new text-to-speech model that combines autoregressive and non-autoregressive approaches using a Temporal Masked Transformer architecture. The model achieves 5.8x lower first-packet latency and 8.8x improved real-time performance while maintaining speech quality comparable to existing models.

AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 11

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Researchers identify why Diffusion Language Models (DLMs) struggle with parallel token generation, finding that training data structure forces autoregressive-like behavior. They propose NAP, a data-centric approach using multiple independent reasoning trajectories that improves parallel decoding performance on math benchmarks.