173 articles tagged with #diffusion-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 12
🧠 Researchers developed FMCT/EFMCT, a new Flow Matching-based framework for CT medical imaging reconstruction that significantly improves computational efficiency over existing diffusion models. The method uses deterministic ordinary differential equations and velocity field reuse to reduce neural network evaluations while maintaining reconstruction quality.
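The velocity-reuse idea is generic enough to sketch: below, a cheap toy velocity field stands in for the trained network, and the Euler loop re-evaluates it only every few steps. All names and numbers are illustrative, not from the paper:

```python
import numpy as np

def flow_matching_sample(x0, velocity_fn, n_steps=100, reuse_every=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps,
    re-evaluating the (expensive) velocity network only every
    `reuse_every` steps and reusing the cached value in between."""
    x = x0.astype(float)
    dt = 1.0 / n_steps
    v = None
    evals = 0
    for i in range(n_steps):
        t = i * dt
        if i % reuse_every == 0:   # fresh network evaluation
            v = velocity_fn(x, t)
            evals += 1
        x = x + dt * v             # Euler update with cached velocity
    return x, evals

# Toy linear field standing in for the network: v(x, t) = target - x.
target = np.array([1.0, -2.0])
x, evals = flow_matching_sample(np.zeros(2), lambda x, t: target - x,
                                n_steps=100, reuse_every=4)
```

With `reuse_every=4` the loop needs 25 network calls instead of 100, and for this smooth toy field the endpoint stays within a few percent of the exact ODE solution.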
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠 Researchers introduced RAISE, a training-free evolutionary framework that improves text-to-image generation by adaptively refining outputs based on prompt complexity. The system achieves state-of-the-art alignment scores while reducing computational costs by 30-80% compared to existing methods.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠 Researchers propose ArtiFixer, a two-stage pipeline using auto-regressive diffusion models to enhance 3D reconstruction quality. The method addresses scalability and quality issues in existing approaches by training a bidirectional generative model with opacity mixing, then distilling it into a causal auto-regressive model that generates hundreds of frames in a single pass.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠 IdGlow introduces a new AI framework for generating images with multiple subjects that preserves individual identities while creating coherent scenes. The system uses a two-stage approach with Flow Matching diffusion models and addresses the challenge of maintaining identity fidelity during complex transformations like age changes.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠 Researchers developed a spatiotemporal diffusion autoencoder using CT brain images to predict stroke outcomes and evolution. The AI model achieved best-in-class performance for predicting next-day severity and functional outcomes using a dataset of 5,824 CT images from 3,573 patients across two medical centers.
AI Neutral · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠 Researchers introduce SurgUn, a surgical unlearning method for text-to-image diffusion models that enables precise removal of specific visual concepts while preserving other capabilities. The approach addresses challenges in copyright compliance and content policy enforcement by applying targeted weight-space updates based on retroactive interference theory.
AI Neutral · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠 Researchers introduced EraseAnything++, a new framework for removing unwanted concepts from advanced AI image and video generation models like Stable Diffusion v3 and Flux. The method uses multi-objective optimization to balance concept removal against overall generative quality, showing superior performance compared to existing approaches.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠 Researchers developed an open-source modular benchmark for evaluating diffusion-based motion planners in real-world autonomous driving systems. The system integrates with the Autoware ROS 2 stack and achieves a 3.2x latency reduction through encoder caching while improving accuracy by 41% with second-order solving.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠 Researchers introduce MetaState, a recurrent augmentation for discrete diffusion language models (dLLMs) that adds persistent working memory to improve text generation quality. The system addresses the 'Information Island' problem where intermediate representations are discarded between denoising steps, achieving improved accuracy on LLaDA-8B and Dream-7B models with minimal parameter overhead.
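The general pattern of threading a persistent memory through a denoising loop can be sketched abstractly; the denoiser and update rule below are toy stand-ins, not the paper's architecture:

```python
import numpy as np

def denoise_with_memory(x, denoise_fn, update_fn, n_steps=3, state_dim=4):
    """Carry a persistent state vector across denoising steps so
    intermediate information survives between iterations instead of
    being discarded, in the spirit of a recurrent working memory."""
    state = np.zeros(state_dim)
    for t in range(n_steps):
        x = denoise_fn(x, state, t)   # step reads the memory
        state = update_fn(state, x)   # ...and writes it back
    return x, state

# Toy stand-ins for the denoiser and the memory update rule.
x, state = denoise_with_memory(
    np.array([8.0]),
    denoise_fn=lambda x, s, t: 0.5 * x + s.mean(),
    update_fn=lambda s, x: s + x.mean(),
)
```

The key contrast with a plain denoising loop is only the `state` argument: each step's output influences every later step through the carried vector.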
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠 Researchers introduce 3R, a new RAG-based framework that optimizes prompts for text-to-video generation models without requiring model retraining. The system uses three key strategies to improve video quality: RAG-based modifier extraction, diffusion-based preference optimization, and temporal frame interpolation for better consistency.
AI Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠 Researchers propose Likelihood-Free Policy Optimization (LFPO), a new framework for improving Diffusion Large Language Models by bypassing likelihood computation issues that plague existing methods. LFPO uses geometric velocity rectification to optimize denoising logits directly, achieving better performance on code and reasoning tasks while reducing inference time by 20%.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠 Researchers propose FAST-DIPS, a new training-free diffusion prior method for solving inverse problems that achieves up to 19.5x speedup while maintaining competitive image quality metrics. The method replaces computationally expensive inner optimization loops with closed-form projections and analytic step sizes, significantly reducing the number of required denoiser evaluations.
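For a linear forward operator, a closed-form measurement projection of the kind the summary describes looks like this. This is a generic sketch of the textbook affine projection, not the paper's algorithm:

```python
import numpy as np

def project_to_measurements(x, A, y):
    """Closed-form projection of x onto the affine set {x : A x = y},
    replacing an inner gradient-descent loop with the pseudoinverse
    correction x + A^T (A A^T)^{-1} (y - A x)."""
    correction = A.T @ np.linalg.solve(A @ A.T, y - A @ x)
    return x + correction

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))   # under-determined forward operator
x_true = rng.standard_normal(8)
y = A @ x_true                    # observed measurements
x_hat = project_to_measurements(rng.standard_normal(8), A, y)
```

One matrix solve replaces many iterative refinement steps, which is the broad mechanism behind the reported reduction in denoiser evaluations.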
AI Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠 Researchers propose FreeAct, a new quantization framework for Large Language Models that improves efficiency by using dynamic transformation matrices for different token types. The method achieves up to 5.3% performance improvement over existing approaches by addressing the memory and computational overhead challenges in LLMs.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠 Researchers have developed DCDP, a Dynamic Closed-Loop Diffusion Policy framework that significantly improves robotic manipulation in dynamic environments. The system achieves 19% better adaptability without retraining while requiring only 5% additional computational overhead through real-time action correction and environmental dynamics integration.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠 Researchers developed MAP-Diff, a multi-anchor guided diffusion framework that improves 3D whole-body PET scan denoising by using intermediate-dose scans as trajectory anchors. The method achieves significant improvements in image quality metrics, increasing PSNR from 42.48 dB to 43.71 dB while reducing radiation exposure for patients.
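For reference, the PSNR metric quoted above has a standard definition, independent of the paper:

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# A constant 0.01 error on unit-range data gives MSE = 1e-4, i.e. 40 dB.
ref = np.linspace(0.0, 1.0, 1000)
noisy = ref + 0.01
```

Because the scale is logarithmic, the reported 1.23 dB gain (42.48 to 43.71 dB) corresponds to roughly a 25% reduction in mean squared error.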
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠 LiftAvatar is a new AI system that enhances 3D avatar animation by completing sparse monocular video observations in kinematic space using expression-controlled video diffusion Transformers. The technology addresses limitations in 3D Gaussian Splatting-based avatars by generating high-quality, temporally coherent facial expressions from single or multiple reference images.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠 Sketch2Colab is a new AI system that converts 2D sketches into realistic 3D multi-human animations with precise control over interactions and movements. The technology uses a novel approach combining sketch-driven diffusion with rectified-flow distillation for faster, more stable animation generation than existing methods.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠 Researchers introduce SounDiT, a new AI model that generates realistic landscape images from environmental soundscapes using geo-contextual data. The model uses diffusion transformer technology and is trained on two large-scale datasets pairing environmental sounds with real-world landscape images.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠 Researchers propose EquiReg, a new framework that improves diffusion models for inverse problems like image restoration by keeping sampling trajectories on the data manifold. The method uses equivariance regularization to guide sampling toward symmetry-preserving regions, enabling high-quality reconstructions with fewer sampling steps.
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠 Researchers propose a new iterative distillation framework for fine-tuning diffusion models in biomolecular design that optimizes for specific reward functions. The method addresses stability and efficiency issues in existing reinforcement learning approaches by using off-policy data collection and KL divergence minimization for improved training stability.
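The reward-plus-KL-anchor objective that such fine-tuning methods optimize can be estimated from samples. This is a generic sketch with toy numbers; `beta`, the reward, and the log densities are all illustrative:

```python
import numpy as np

def kl_regularized_objective(samples, reward_fn, logq, logp, beta=0.5):
    """Sample-based estimate of E_q[r(x)] - beta * KL(q || p): reward
    maximization anchored to the pretrained model by a KL penalty.
    `logq`/`logp` are per-sample log densities under the fine-tuned
    and pretrained models."""
    rewards = np.array([reward_fn(x) for x in samples])
    kl_estimate = np.mean(logq - logp)   # Monte-Carlo KL(q || p)
    return rewards.mean() - beta * kl_estimate

# Toy numbers: three samples, a unit log-density gap between models.
obj = kl_regularized_objective(
    samples=[0.0, 1.0, 2.0],
    reward_fn=lambda x: x,
    logq=np.array([-1.0, -1.0, -1.0]),
    logp=np.array([-2.0, -2.0, -2.0]),
    beta=0.5,
)
```

The KL term is what keeps the fine-tuned sampler from collapsing onto degenerate reward-hacking outputs, which is the stability issue the summary mentions.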
AI Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠 Researchers introduce SHINE, a training-free framework that enables FLUX and other diffusion models to perform high-quality image composition without retraining. The framework addresses complex lighting scenarios like shadows and reflections, achieving state-of-the-art performance on the new ComplexCompo benchmark.
AI Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠 Researchers introduce SALIENT, a frequency-aware diffusion model framework that improves detection of rare lesions in CT scans by generating synthetic training data in the wavelet domain rather than in pixel space. The approach addresses extreme class imbalance in medical imaging through controllable augmentation, achieving significant improvements in detection performance for low-prevalence conditions.
AI Bullish · arXiv – CS AI · Mar 2 · 6/10 · 15
🧠 Researchers introduce DiffusionHarmonizer, an AI framework that enhances neural reconstruction simulations for autonomous robots by converting multi-step image diffusion models into single-step enhancers. The system addresses artifacts in NeRF and 3D Gaussian Splatting methods while improving realism for applications like self-driving vehicle simulation.
AI Bullish · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠 SceneTok introduces a novel 3D scene tokenizer that compresses view sets into permutation-invariant tokens, achieving 1-3 orders of magnitude better compression than existing methods while maintaining state-of-the-art reconstruction quality. The system enables efficient 3D scene generation in 5 seconds using a lightweight decoder that can render novel viewpoints.
AI Bullish · arXiv – CS AI · Mar 2 · 7/10 · 19
🧠 Researchers have developed a safety filtering framework that ensures AI generative models like diffusion models produce outputs that satisfy hard constraints without requiring model retraining. The approach uses Control Barrier Functions to create a 'constricting safety tube' that progressively tightens constraints during the generation process, achieving 100% constraint satisfaction across image generation, trajectory sampling, and robotic manipulation tasks.
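The 'constricting safety tube' idea can be illustrated with a box constraint that tightens over the denoising steps. This is a toy sketch: the stand-in denoiser and the linear slack schedule are assumptions, not the paper's Control Barrier Function construction:

```python
import numpy as np

def constricting_tube_sample(x, denoise_step, lo, hi, n_steps=50, slack0=2.0):
    """Run a denoising loop while projecting each iterate into a box
    [lo - s_t, hi + s_t] whose slack s_t shrinks linearly to zero, so
    the final output satisfies the hard constraint exactly."""
    for t in range(n_steps):
        x = denoise_step(x, t)
        slack = slack0 * (1.0 - (t + 1) / n_steps)   # tube tightens
        x = np.clip(x, lo - slack, hi + slack)
    return x

rng = np.random.default_rng(1)
step = lambda x, t: x + 0.1 * rng.standard_normal(x.shape)  # stand-in denoiser
out = constricting_tube_sample(rng.standard_normal(4) * 3, step, -1.0, 1.0)
```

Because the slack reaches zero at the last step, the output lies inside [-1, 1] by construction, mirroring the 100% constraint satisfaction claim: the guarantee comes from the projection schedule, not from the model.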