y0news

#diffusion-models News & Analysis

173 articles tagged with #diffusion-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models

Researchers introduce FS-DFM, a discrete flow-matching model that generates long text 128x faster than standard diffusion models while maintaining quality parity. The breakthrough uses few-step sampling with teacher guidance distillation, achieving in 8 steps what previously required 1,024 evaluations.

🏢 Perplexity
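The few-step idea behind FS-DFM can be illustrated with a toy parallel-decoding loop: at each of a small number of rounds, commit the still-masked positions the model is most confident about. This is a minimal sketch of the general pattern, not the paper's algorithm; all names are hypothetical, and a real model would re-score the sequence after every commit.

```python
import math

def few_step_unmask(scores, num_steps=8):
    """Toy few-step parallel decoding: commit the most confident
    masked positions each round until all positions are filled.
    `scores` maps position -> (token, confidence); a real diffusion
    LM would recompute confidences after each round."""
    masked = set(scores)
    out = {}
    per_step = math.ceil(len(scores) / num_steps)
    for _ in range(num_steps):
        if not masked:
            break
        # pick this round's highest-confidence masked positions
        batch = sorted(masked, key=lambda p: scores[p][1], reverse=True)[:per_step]
        for p in batch:
            out[p] = scores[p][0]
            masked.discard(p)
    return [out[p] for p in sorted(out)]
```

With `num_steps=8` on a 1,024-token sequence, this commits roughly 128 tokens per round instead of one per model call, which is the source of the claimed speedup.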
AI · Bullish · arXiv – CS AI · 2d ago · 7/10

PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems

Researchers introduce PnP-CM, a new method that reformulates consistency models as proximal operators within plug-and-play frameworks for solving inverse problems. The approach achieves high-quality image reconstructions with minimal neural function evaluations (4 NFEs), demonstrating practical efficiency gains over existing consistency model solvers and marking the first application of CMs to MRI data.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Introspective Diffusion Language Models

Researchers introduce Introspective Diffusion Language Models (I-DLM), a new approach that combines the parallel generation speed of diffusion models with the quality of autoregressive models by ensuring models verify their own outputs. I-DLM achieves performance matching conventional large language models while delivering 3x higher throughput, potentially reshaping how AI systems are deployed at scale.

AI · Bearish · arXiv – CS AI · 3d ago · 7/10

Re-Mask and Redirect: Exploiting Denoising Irreversibility in Diffusion Language Models

Researchers demonstrate a critical vulnerability in diffusion-based language models where safety mechanisms can be bypassed by re-masking committed refusal tokens and injecting affirmative prefixes, achieving 76-82% attack success rates without gradient optimization. The findings reveal that dLLM safety relies on a fragile architectural assumption rather than robust adversarial defenses.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Advantage-Guided Diffusion for Model-Based Reinforcement Learning

Researchers propose Advantage-Guided Diffusion (AGD-MBRL), a novel approach that improves model-based reinforcement learning by using advantage estimates to guide diffusion models during trajectory generation. The method addresses the short-horizon myopia problem in existing diffusion-based world models and demonstrates 2x performance improvements over current baselines on MuJoCo control tasks.
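A common way to bias a generative sampler toward high-advantage trajectories is a softmax reweighting over advantage estimates. The sketch below shows that generic pattern only; AGD-MBRL's actual guidance rule may differ, and `beta` is a hypothetical temperature parameter.

```python
import math

def advantage_weights(advantages, beta=1.0):
    """Softmax over advantage estimates: higher-advantage candidate
    trajectories get proportionally more sampling weight. Generic
    illustration, not the paper's method."""
    exps = [math.exp(beta * a) for a in advantages]
    z = sum(exps)
    return [e / z for e in exps]
```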

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

DiffSketcher is a novel AI algorithm that generates vector sketches from text prompts by leveraging pre-trained text-to-image diffusion models. The method optimizes Bézier curves using an extended Score Distillation Sampling loss and introduces a stroke initialization strategy based on attention maps, achieving superior results in sketch quality and controllability.

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation

Researchers demonstrate a data-efficient fine-tuning method for text-to-video diffusion models that enables new generative controls using sparse, low-quality synthetic data rather than expensive, photorealistic datasets. Counterintuitively, models trained on simple synthetic data outperform those trained on high-fidelity real data, supported by both empirical results and theoretical justification.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Unlocking Prompt Infilling Capability for Diffusion Language Models

Researchers have developed a method to unlock prompt infilling capabilities in masked diffusion language models by extending full-sequence masking during supervised fine-tuning, rather than the conventional response-only masking. This breakthrough enables models to automatically generate effective prompts that match or exceed manually designed templates, suggesting training practices rather than architectural limitations were the primary constraint.

AI · Bullish · arXiv – CS AI · Apr 6 · 7/10

OSCAR: Orchestrated Self-verification and Cross-path Refinement

Researchers introduce OSCAR, a training-free framework that reduces AI hallucinations in diffusion language models by using cross-chain entropy to detect uncertain token positions during generation. The system runs parallel denoising chains and performs targeted remasking with retrieved evidence to improve factual accuracy without requiring external hallucination classifiers.
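The cross-chain entropy idea can be sketched in a few lines: run several denoising chains, then flag positions where the chains disagree (high Shannon entropy over token choices) as candidates for remasking. This is a toy illustration of the concept, assuming simple token lists rather than model states; OSCAR's actual scoring and remasking logic is more involved.

```python
import math
from collections import Counter

def cross_chain_entropy(chains):
    """Per-position Shannon entropy of token choices across parallel
    denoising chains; high entropy marks uncertain positions."""
    ents = []
    for pos in range(len(chains[0])):
        counts = Counter(chain[pos] for chain in chains)
        total = sum(counts.values())
        ents.append(-sum((c / total) * math.log2(c / total)
                         for c in counts.values()))
    return ents

def positions_to_remask(chains, threshold=0.5):
    """Positions whose cross-chain entropy exceeds `threshold` would
    be remasked and regenerated with retrieved evidence."""
    return [i for i, h in enumerate(cross_chain_entropy(chains)) if h > threshold]
```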

AI · Neutral · arXiv – CS AI · Mar 27 · 7/10

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

Researchers identified critical security vulnerabilities in Diffusion Large Language Models (dLLMs) that differ from traditional autoregressive LLMs, stemming from their iterative generation process. They developed DiffuGuard, a training-free defense framework that reduces jailbreak attack success rates from 47.9% to 14.7% while maintaining model performance.

AI · Bullish · arXiv – CS AI · Mar 27 · 7/10

LLM4AD: Large Language Models for Autonomous Driving -- Concept, Review, Benchmark, Experiments, and Future Trends

Researchers have published a comprehensive review of Large Language Models for Autonomous Driving (LLM4AD), introducing new benchmarks and conducting real-world experiments on autonomous vehicle platforms. The paper explores how LLMs can enhance perception, decision-making, and motion control in self-driving cars, while identifying key challenges including latency, security, and safety concerns.

AI · Bearish · arXiv – CS AI · Mar 26 · 7/10

When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm

Research reveals that multimodal large language models (MLLMs) pose greater safety risks than diffusion models for image generation, producing more unsafe content and creating images that are harder for detection systems to identify. The enhanced semantic understanding capabilities of MLLMs, while more powerful, enable them to interpret complex prompts that lead to dangerous outputs including fake image synthesis.

AI · Neutral · arXiv – CS AI · Mar 26 · 7/10

Anti-I2V: Safeguarding your photos from malicious image-to-video generation

Researchers developed Anti-I2V, a new defense system that protects personal photos from being used to create malicious deepfake videos through image-to-video AI models. The system works across different AI architectures by operating in multiple domains and targeting specific network layers to degrade video generation quality.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

UniVid: Pyramid Diffusion Model for High Quality Video Generation

Researchers have developed UniVid, a new pyramid diffusion model that unifies text-to-video and image-to-video generation into a single system. The model uses dual-stream cross-attention mechanisms to process both text prompts and reference images, achieving superior temporal coherence across different video generation tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning

Researchers introduce MARVAL, a distillation framework that accelerates masked auto-regressive diffusion models by compressing inference into a single step while enabling practical reinforcement learning applications. The method achieves 30x speedup on ImageNet with comparable quality, making RL post-training feasible for the first time with these models.

AI · Neutral · arXiv – CS AI · Mar 17 · 7/10

Safety-Guided Flow (SGF): A Unified Framework for Negative Guidance in Safe Generation

Researchers introduce Safety-Guided Flow (SGF), a unified probabilistic framework that combines control barrier functions with negative guidance approaches to improve safety in AI-generated content. The framework identifies a critical time window during the denoising process where strong negative guidance is most effective for preventing harmful outputs.
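The "critical time window" observation can be illustrated with a toy classifier-free guidance rule where the negative term is applied only inside a denoising-time interval. Scalars stand in for score tensors here, and every parameter name is hypothetical; this echoes the windowing idea only, not SGF's control-barrier formulation.

```python
def guided_score(cond, uncond, neg, w_pos=2.0, w_neg=1.0,
                 t=0.5, window=(0.2, 0.8)):
    """Toy windowed negative guidance: standard classifier-free
    guidance everywhere, with the negative-prompt term subtracted
    only while denoising time t lies inside `window`."""
    s = uncond + w_pos * (cond - uncond)
    if window[0] <= t <= window[1]:
        s -= w_neg * (neg - uncond)  # push away from unsafe direction
    return s
```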

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration

Researchers propose LESA, a new framework that accelerates Diffusion Transformers (DiTs) by up to 6.25x using learnable predictors and Kolmogorov-Arnold Networks. The method achieves significant speedups while maintaining or improving generation quality in text-to-image and text-to-video synthesis tasks.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages

Researchers developed a new reinforcement learning approach for training diffusion language models that uses entropy-guided step selection and stepwise advantages to overcome challenges with sequence-level likelihood calculations. The method achieves state-of-the-art results on coding and logical reasoning benchmarks while being more computationally efficient than existing approaches.

AI · Bearish · arXiv – CS AI · Mar 16 · 7/10

Purify Once, Edit Freely: Breaking Image Protections under Model Mismatch

Researchers have identified a critical vulnerability in image protection systems that use adversarial perturbations to prevent unauthorized AI editing. Two new purification methods can effectively remove these protections, creating a 'purify-once, edit-freely' attack where images become vulnerable to unlimited manipulation.

AI · Bullish · arXiv – CS AI · Mar 12 · 7/10

ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping

Researchers developed ES-dLLM, a training-free inference acceleration framework that speeds up diffusion large language models by selectively skipping tokens in early layers based on importance scoring. The method achieves 5.6x to 16.8x speedup over vanilla implementations while maintaining generation quality, offering a promising alternative to autoregressive models.

🏢 Nvidia
AI · Bearish · arXiv – CS AI · Mar 11 · 7/10

NetDiffuser: Deceiving DNN-Based Network Attack Detection Systems with Diffusion-Generated Adversarial Traffic

Researchers developed NetDiffuser, a framework that uses diffusion models to generate natural adversarial examples capable of deceiving AI-based network intrusion detection systems. The system achieved up to 29.93% higher attack success rates compared to baseline attacks, highlighting significant vulnerabilities in current deep learning-based security systems.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Reviving ConvNeXt for Efficient Convolutional Diffusion Models

Researchers introduce FCDM, a fully convolutional diffusion model based on ConvNeXt architecture that achieves competitive performance with DiT-XL/2 using only 50% of the computational resources. The model demonstrates exceptional training efficiency, requiring 7x fewer training steps and can be trained on just 4 GPUs, reviving convolutional networks as an efficient alternative to Transformer-based diffusion models.

AI · Bullish · arXiv – CS AI · Mar 9 · 7/10

Physical Simulator In-the-Loop Video Generation

Researchers introduce PSIVG, a framework that integrates physical simulators into AI video generation to ensure generated videos obey real-world physics like gravity and collision. The system reconstructs 4D scenes from template videos and uses physical simulations to guide video generators toward more realistic motion while maintaining visual quality.

Page 1 of 7