#latent-space News & Analysis

21 articles tagged with #latent-space. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 27 · 7/10

Scalable GANs with Transformers

Researchers introduce GAT, a transformer-based GAN architecture trained in VAE latent space that achieves state-of-the-art image generation performance. The model reaches FID 2.96 on ImageNet-256 in just 40 epochs, 6x faster than comparable baselines, while scaling reliably from small to extra-large capacities.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

Researchers introduce History-Echoes, a framework revealing how large language models become trapped by their conversational history, with past interactions creating geometric constraints in latent space that bias future responses. The study demonstrates that behavioral persistence in LLMs manifests as mathematical traps where previous hallucinations and responses influence subsequent model behavior across multiple model families and datasets.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Low-Resource Guidance for Controllable Latent Audio Diffusion

Researchers have developed a new method called Latent-Control Heads (LatCHs) that enables efficient control of audio generation in diffusion models with significantly reduced computational costs. The approach operates directly in latent space, avoiding expensive decoder steps and requiring only 7M parameters and 4 hours of training while maintaining audio quality.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

Researchers introduce LaDiR (Latent Diffusion Reasoner), a novel framework that combines continuous latent representation with iterative refinement capabilities to enhance Large Language Models' reasoning abilities. The system uses a Variational Autoencoder to encode reasoning steps and a latent diffusion model for parallel generation of diverse reasoning trajectories, showing improved accuracy and interpretability in mathematical reasoning benchmarks.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

SymTRELLIS: Symmetry-Enforced Voxel Latents for 3D Generation

SymTRELLIS introduces a method to enforce geometric symmetries in 3D generative models without retraining underlying systems, using learned linear operators on voxel latents and velocity symmetrization during generation. The technique substantially reduces symmetry violations across rotational, reflectional, and polyhedral symmetries compared to existing models like TRELLIS.2 and Hunyuan3D-2.1.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

BRo-JEPA: Learning Modular Arithmetic in Latent Space

Researchers introduce BRo-JEPA, a neural network architecture that learns modular arithmetic rules by imposing circular structure in latent space, achieving 99.46% zero-shot generalization on unseen operations. The work demonstrates that neural networks can learn abstract algebraic rules rather than merely memorizing patterns when architecture aligns with problem structure.
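Why circular structure suits modular arithmetic can be shown with a minimal sketch. It is not the BRo-JEPA architecture; it only illustrates the underlying fact that placing residues on a unit circle turns addition mod n into angle addition. The helper names `encode`, `add_on_circle`, `decode`, and `mod_add` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

def encode(k: int, n: int) -> Point:
    """Place the residue k (mod n) on the unit circle."""
    angle = 2 * math.pi * (k % n) / n
    return (math.cos(angle), math.sin(angle))

def add_on_circle(a: Point, b: Point) -> Point:
    """Compose two circle points by complex multiplication, which adds their angles."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def decode(p: Point, n: int) -> int:
    """Map a circle point back to the nearest residue in 0..n-1."""
    angle = math.atan2(p[1], p[0]) % (2 * math.pi)
    return round(angle * n / (2 * math.pi)) % n

def mod_add(x: int, y: int, n: int) -> int:
    """Add x and y modulo n using only the circular representation."""
    return decode(add_on_circle(encode(x, n), encode(y, n)), n)
```

Because the composition rule is the same for every pair of points, it applies to residue pairs that were never enumerated in advance, which is the kind of rule-level generalization the summary describes.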

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation

Researchers propose a novel framework for controlling symbolic music generation in Transformer models through activation steering, enabling fine-grained control over musical attributes like pitch and duration without retraining. The approach uses latent space analysis and orthogonalization techniques to independently manipulate multiple attributes while reducing interference and maintaining generation quality.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models

Lumos-Nexus is a new video generation framework that separates training and inference to improve both reasoning quality and visual fidelity. The system uses a lightweight generator during training and progressively hands off to a high-capacity generator during inference through a technique called Unified Progressive Frequency Bridging, while introducing VR-Bench as a benchmark for reasoning-driven video generation.

AI · Neutral · arXiv – CS AI · 6d ago · 6/10

Autoregression-Free Neural Operators for Time-Dependent PDEs

Researchers propose Autoregression-Free Neural Operators (AFNO), a new approach for solving time-dependent partial differential equations that models continuous-time evolution in latent space rather than performing recursive predictions. By avoiding autoregressive rollout and using flow matching, AFNO reduces error accumulation over long-horizon predictions and demonstrates improved stability across six PDE benchmarks.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Falcon-X: A Time Series Foundation Model for Heterogeneous Multivariate Modeling

Falcon-X is a new time series foundation model that improves multivariate forecasting by mapping heterogeneous data types into a unified latent space rather than processing raw variables directly. The model uses novel attention mechanisms to capture both positive and negative relationships between variables, achieving state-of-the-art performance on forecasting benchmarks.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

diffGHOST: Diffusion based Generative Hedged Oblivious Synthetic Trajectories

diffGHOST is a new conditional diffusion model that synthesizes mobility trajectories while preserving privacy through latent space segmentation. The approach addresses a critical gap in existing generative models that lack formal privacy guarantees despite handling sensitive personal movement data.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

NoisyCoconut: Counterfactual Consensus via Latent Space Reasoning

NoisyCoconut is an inference-time method that improves LLM reliability by injecting controlled noise into internal representations to generate diverse reasoning paths, enabling models to abstain when uncertain without requiring retraining. The technique reduces error rates from 40-70% to below 15% on mathematical reasoning tasks through unanimous agreement among noise-perturbed paths, offering practical reliability improvements compatible with existing models.
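The unanimity rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_path` is a hypothetical stand-in for one forward pass of a model with noise injected into its internal representations, and `n_paths` and `noise_scale` are illustrative parameters.

```python
import random
from typing import Callable, Optional, Sequence

def unanimous_or_abstain(answers: Sequence[str]) -> Optional[str]:
    """Return the shared answer if every path agrees, otherwise None (abstain)."""
    if not answers:
        return None
    first = answers[0]
    return first if all(a == first for a in answers) else None

def answer_with_consensus(
    run_path: Callable[[float, random.Random], str],  # hypothetical noisy model call
    n_paths: int = 5,
    noise_scale: float = 0.1,
    seed: int = 0,
) -> Optional[str]:
    """Run several noise-perturbed paths and answer only on unanimous agreement."""
    rng = random.Random(seed)
    answers = [run_path(noise_scale, rng) for _ in range(n_paths)]
    return unanimous_or_abstain(answers)
```

Requiring unanimity rather than a majority trades coverage for reliability: the model answers less often, but a single dissenting path is enough to trigger abstention.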

AI · Bullish · arXiv – CS AI · May 12 · 6/10

Why Do DiT Editors Drift? Plug-and-Play Low Frequency Alignment in VAE Latent Space

Researchers have identified why diffusion transformers (DiTs) degrade in quality during multi-turn image editing and proposed VAE-LFA, a training-free alignment method that operates in VAE latent space to suppress accumulated semantic drift. The solution works with both white-box and black-box models by aligning low-frequency components across editing rounds while preserving high-frequency details.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Multimodal synthesis of MRI and tabular data with diffusion in a joint latent space via cross-attention

Researchers have developed a multimodal latent diffusion model that simultaneously synthesizes MRI brain scans and clinical tabular data (age, sex, body measurements) within a shared latent space using cross-attention mechanisms. Tested on over 10,000 participants from the German National Cohort, the system generates anatomically plausible synthetic medical data where image and tabular attributes remain coherently aligned, representing the first successful joint modeling of volumetric medical images with mixed-type clinical data.

AI · Bullish · arXiv – CS AI · May 9 · 6/10

Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs

Researchers introduce Memory Inception (MI), a training-free method for steering large language models by inserting text-derived key-value banks at selected attention layers rather than caching full prompts. MI achieves control competitive with instruction prompting while using up to 118x less storage, and it outperforms existing activation steering methods on personality, reasoning, and guidance tasks.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Continuous Latent Diffusion Language Model

Researchers propose Cola DLM, a hierarchical latent diffusion language model that generates text through continuous semantic modeling rather than traditional left-to-right autoregressive decoding. The approach achieves comparable performance to autoregressive models while offering greater flexibility, better scaling properties, and a potential pathway for unified modeling across discrete and continuous modalities.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling

Researchers propose Hamiltonian World Models, a physics-grounded approach to generative world modeling that encodes observations into structured latent phase spaces and evolves them through Hamiltonian-inspired dynamics. The framework aims to address limitations in current world models by prioritizing physical accuracy and action-controllability alongside visual realism, with applications to robotics, autonomous driving, and reinforcement learning.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Latent Structure of Affective Representations in Large Language Models

Researchers investigate how large language models represent emotions in their latent spaces, discovering that LLMs develop coherent emotional representations aligned with established psychological models of valence and arousal. The findings support the linear representation hypothesis used in AI transparency methods and demonstrate practical applications for uncertainty quantification in emotion processing tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs

Researchers introduce AdaAnchor, a new AI reasoning framework that performs silent computation in latent space rather than generating verbose step-by-step reasoning. The system adaptively determines when to stop refining its internal reasoning process, achieving up to 5% better accuracy while reducing token generation by 92-93% and cutting refinement steps by 48-60%.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3

Latent Diffusion Model without Variational Autoencoder

Researchers introduce SVG, a new latent diffusion model that eliminates the need for variational autoencoders by using self-supervised representations. The approach leverages frozen DINO features to create semantically structured latent spaces, enabling faster training, fewer sampling steps, and better generative quality while maintaining semantic capabilities.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

CodecFlow is a new neural codec-based framework for speech bandwidth extension that efficiently reconstructs high-quality audio in compact latent space. The system uses conditional flow matching and residual vector quantization to improve speech clarity by restoring high-frequency content from low-bandwidth audio.