y0news

#generative-models News & Analysis

80 articles tagged with #generative-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Offline Reinforcement Learning with Generative Trajectory Policies

Researchers propose Generative Trajectory Policies (GTPs), a unified framework for offline reinforcement learning that bridges the performance gap between slow diffusion models and fast consistency policies by learning continuous-time generative trajectories. The approach achieves state-of-the-art results on D4RL benchmarks, including perfect scores on difficult AntMaze tasks.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

Aligning Few-Step Generative Models by Amortizing Sample-based Variational Inference

Researchers introduce FAV, a novel framework for aligning few-step generative models that requires only sample access to generators and reference distributions. The method uses Stein Variational Gradient Descent to cast alignment as sampling from reward-tilted distributions, demonstrating superior performance across robotic manipulation tasks and scaling to high-resolution image synthesis.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

Scalable GANs with Transformers

Researchers introduce GAT, a transformer-based GAN architecture trained in VAE latent space that achieves state-of-the-art image generation performance. The model reaches FID 2.96 on ImageNet-256 in just 40 epochs, 6x faster than comparable baselines, while scaling reliably from small to extra-large capacities.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

Recursive Flow Matching

Researchers introduce Recursive Flow Matching (RecFM), a generative AI framework that significantly improves the speed and accuracy of physics simulations by enforcing self-consistency across computational scales. The method achieves high-fidelity predictions in 1-4 steps with up to 20× speedup over existing diffusion models while reducing error by 15%, addressing a critical bottleneck in scientific computing.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Yeti: A compact protein structure tokenizer for reconstruction and multi-modal generation

Researchers introduce Yeti, a compact protein structure tokenizer that converts protein structures into discrete tokens for multimodal AI models. The approach achieves superior codebook utilization and token diversity while maintaining competitive reconstruction accuracy with 10x fewer parameters than existing solutions, enabling efficient joint generation of protein sequences and structures.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

On Variance Reduction in Learning Mean Flows

Researchers identify and resolve a critical instability in MeanFlow training for one-step generative models by correcting how the conditional velocity field is used in the loss calculation. The fix, derived in closed form, improves sample quality by up to 54% on benchmarks and produces monotonic FID improvements across diffusion transformer checkpoints. It also reveals a practical mismatch between the FID and MSE landscapes.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

FlashMol: High-Quality Molecule Generation in as Few as Four Steps

FlashMol generates high-quality 3D molecular conformations in just 4 steps, compared with the hundreds required by traditional diffusion models. The technique achieves a 250x acceleration in sampling speed while matching or exceeding the quality of slower teacher models, potentially transforming the economics of large-scale in silico screening for computational drug discovery.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

APEX: Assumption-free Projection-based Embedding eXamination Metric for Image Quality Assessment

Researchers introduce APEX, a novel image quality assessment metric that addresses fundamental limitations in existing evaluation methods like FID by using Sliced Wasserstein Distance and modern foundation models (CLIP, DINOv2) as embedding-agnostic feature extractors. The framework eliminates parametric assumptions while maintaining scalability to high-dimensional spaces, demonstrating superior robustness and stability across datasets.

AI · Bearish · arXiv – CS AI · May 11 · 7/10

An Embarrassingly Simple Graph Heuristic Reveals Shortcut-Solvable Benchmarks for Sequential Recommendation

Researchers demonstrate that a simple graph heuristic without machine learning matches or outperforms advanced generative recommendation systems on standard benchmarks, revealing that widely-used datasets contain structural shortcuts that don't require sophisticated modeling. The findings question whether current benchmark evaluations actually validate the advanced capabilities that modern recommendation systems claim to provide.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

MidSteer: Optimal Affine Framework for Steering Generative Models

Researchers introduce MidSteer, a theoretical framework for steering generative models through intermediate representation manipulation. The work formalizes concept steering as an optimization problem, demonstrating that existing safety alignment methods are special cases of affine transformations, with applications across vision and language models.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning

Researchers propose Generative Actor-Critic (GenAC), a new approach to value modeling in large language model reinforcement learning that uses chain-of-thought reasoning instead of one-shot scalar predictions. The method addresses a longstanding challenge in credit assignment by improving value approximation and downstream RL performance compared to existing value-based and value-free baselines.

AI · Bullish · arXiv – CS AI · Mar 12 · 7/10

Gradient Flow Drifting: Generative Modeling via Wasserstein Gradient Flows of KDE-Approximated Divergences

Researchers introduce Gradient Flow Drifting, a new mathematical framework for generative AI models that connects the Drifting Model to Wasserstein gradient flows of KL divergence under kernel density estimation. The framework includes a mixed-divergence strategy to avoid mode collapse and extends to Riemannian manifolds for improved semantic space applications.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

MPFlow: Multi-modal Posterior-Guided Flow Matching for Zero-Shot MRI Reconstruction

Researchers developed MPFlow, a new zero-shot MRI reconstruction framework that uses multi-modal data and rectified flow to improve medical imaging quality. The system reduces tumor hallucinations by 15% while using 80% fewer sampling steps compared to existing diffusion methods, potentially advancing AI applications in medical diagnostics.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

InEdit-Bench: Benchmarking Intermediate Logical Pathways for Intelligent Image Editing Models

Researchers introduced InEdit-Bench, the first evaluation benchmark specifically designed to test image editing models' ability to reason through intermediate logical pathways in multi-step visual transformations. Testing 14 representative models revealed significant shortcomings in handling complex scenarios requiring dynamic reasoning and procedural understanding.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles

Researchers introduce RigidSSL, a new geometric pretraining framework for protein design that improves designability by up to 43% and enhances success rates in protein generation tasks. The two-phase approach combines geometric learning from 432K protein structures with molecular dynamics refinement to better capture protein conformational dynamics.

AI · Neutral · arXiv – CS AI · Mar 4 · 7/10

Unsupervised Representation Learning -- an Invariant Risk Minimization Perspective

Researchers propose a new unsupervised framework for Invariant Risk Minimization (IRM) that learns robust representations without labeled data. The approach introduces two methods, Principal Invariant Component Analysis (PICA) and Variational Invariant Autoencoder (VIAE), that capture invariant structures across different environments using only unlabeled data.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

CoBELa: Steering Transparent Generation via Concept Bottlenecks on Energy Landscapes

Researchers introduce CoBELa, a new AI framework for interpretable image generation that uses concept bottlenecks on energy landscapes to enable transparent, controllable synthesis without requiring decoder retraining. The system achieves strong performance on benchmark datasets while allowing users to compositionally manipulate concepts through energy function combinations.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

Controlling Exploration-Exploitation in GFlowNets via Markov Chain Perspectives

Researchers introduce α-GFNs, an enhanced version of Generative Flow Networks that allows tunable control over exploration-exploitation dynamics through a parameter α. The method achieves up to 10× improvement in mode discovery across various benchmarks by addressing constraints in traditional GFlowNet objectives through Markov chain theory.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

PrismFlow: Residual Dynamics for Flow Matching in Time-Series Generation

PrismFlow introduces a novel Flow Matching method for time-series generation that uses Koopman-inspired dynamical experts to address spectral distortion problems in existing models. By employing residual corrections and confidence-aware expert selection, the approach achieves significant performance improvements (15.6% gain in Context-FID, 38.6% in Discriminative Score) while maintaining stability and effectiveness in low-data scenarios.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

Nano World Models: A Minimalist Implementation of Future Video Prediction

Researchers introduce Nano World Models, an open-source minimalist framework for future video prediction using diffusion forcing. The release provides the research community with a compact, reproducible codebase and pretrained checkpoints to study world-modeling components that are typically scattered across industry implementations.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Constrained Auto-Bidding via Generative Response Modeling

Researchers introduce Generative Response Model (GRM), a machine learning approach that optimizes digital advertising bidding by predicting future traffic and cost outcomes rather than making individual bid decisions. The system enforces budget and performance constraints through analytic controllers, demonstrating improved stability and performance over existing auto-bidding methods.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

SmartDirector is a new AI framework for video generation that uses multiple keyframes to give precise control over narrative structure and temporal pacing. It supports single-shot generation, multi-shot synthesis, and video extension through a two-stage process that combines low-resolution generation with high-resolution refinement.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

STFlow: Data-Coupled Flow Matching for Geometric Trajectory Simulation

Researchers introduce STFlow, a machine learning model that improves trajectory simulation for complex dynamical systems by using graph neural networks and data-dependent couplings within a Flow Matching framework. The approach outperforms existing methods on molecular dynamics, N-body systems, and pedestrian forecasting with fewer simulation steps and lower computational costs.
