y0news

#parallel-generation News & Analysis

8 articles tagged with #parallel-generation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

EPIC: Efficient and Parallel Inference under CFG Constraints for Diffusion Language Models

Researchers introduce EPIC, an efficient decoding framework for diffusion language models that operate under context-free grammar constraints. The method reduces inference time by up to 67.5% compared to existing CFG-constrained approaches while preserving the parallel decoding advantage that makes diffusion models competitive with autoregressive alternatives.

AI · Bullish · arXiv – CS AI · May 1 · 7/10

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed

Researchers introduce Efficient-DLM, a framework for converting pretrained autoregressive language models into diffusion language models that enable parallel, non-autoregressive generation. The approach uses block-wise attention patterns and position-dependent masking to preserve model accuracy while achieving 4.5x higher throughput compared to existing models.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Introspective Diffusion Language Models

Researchers introduce Introspective Diffusion Language Models (I-DLM), a new approach that combines the parallel generation speed of diffusion models with the quality of autoregressive models by ensuring models verify their own outputs. I-DLM achieves performance matching conventional large language models while delivering 3x higher throughput, potentially reshaping how AI systems are deployed at scale.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3

On the Reasoning Abilities of Masked Diffusion Language Models

New research demonstrates that Masked Diffusion Models (MDMs) for text generation are computationally equivalent to chain-of-thought augmented transformers in finite-precision settings. The study proves MDMs can solve all reasoning problems that CoT transformers can, while being more efficient for certain problem classes due to parallel generation capabilities.

AI · Bullish · arXiv – CS AI · 1d ago · 6/10

Self-Augmenting Retrieval for Diffusion Language Models

Researchers introduce SARDI, a training-free retrieval-augmented generation framework for discrete diffusion language models that leverages low-confidence token predictions as lookahead signals to guide information retrieval during text generation. The approach achieves significant performance gains on multi-hop question-answering tasks while operating at substantially higher throughput than existing baselines.

AI · Bullish · arXiv – CS AI · May 12 · 6/10

TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM

Researchers introduce TAD, a temporal-aware self-distillation framework that improves diffusion large language models' accuracy-parallelism trade-off by using adaptive loss functions based on token decoding timelines. The method increases accuracy from 46.2% to 51.6% while enabling aggressive acceleration modes, addressing a fundamental limitation in parallel text generation.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

Breaking the Factorization Barrier in Diffusion Language Models

Researchers introduce Coupled Discrete Diffusion (CoDD), a framework that addresses the "factorization barrier" in diffusion language models by enabling parallel token generation without sacrificing coherence. The approach uses a lightweight probabilistic inference layer to model complex joint dependencies while maintaining computational efficiency.