AI · Bullish · arXiv – CS AI · May 27 · 7/10
🧠Researchers introduce GAT, a transformer-based GAN architecture trained in VAE latent space that achieves state-of-the-art image generation performance. The model reaches FID 2.96 on ImageNet-256 in just 40 epochs, 6x faster than comparable baselines, while scaling reliably from small to extra-large capacities.
AI · Neutral · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers introduce History-Echoes, a framework revealing how large language models become trapped by their conversational history, with past interactions creating geometric constraints in latent space that bias future responses. The study demonstrates that behavioral persistence in LLMs manifests as mathematical traps where previous hallucinations and responses influence subsequent model behavior across multiple model families and datasets.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers have developed a new method called Latent-Control Heads (LatCHs) that enables efficient control of audio generation in diffusion models with significantly reduced computational costs. The approach operates directly in latent space, avoiding expensive decoder steps and requiring only 7M parameters and 4 hours of training while maintaining audio quality.
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3
🧠Researchers introduce LaDiR (Latent Diffusion Reasoner), a novel framework that combines continuous latent representation with iterative refinement capabilities to enhance Large Language Models' reasoning abilities. The system uses a Variational Autoencoder to encode reasoning steps and a latent diffusion model for parallel generation of diverse reasoning trajectories, showing improved accuracy and interpretability in mathematical reasoning benchmarks.
AI · Neutral · arXiv – CS AI · 16h ago · 6/10
🧠SymTRELLIS introduces a method to enforce geometric symmetries in 3D generative models without retraining underlying systems, using learned linear operators on voxel latents and velocity symmetrization during generation. The technique substantially reduces symmetry violations across rotational, reflectional, and polyhedral symmetries compared to existing models like TRELLIS.2 and Hunyuan3D-2.1.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers introduce BRo-JEPA, a neural network architecture that learns modular arithmetic rules by imposing circular structure in latent space, achieving 99.46% zero-shot generalization on unseen operations. The work demonstrates that neural networks can learn abstract algebraic rules rather than merely memorizing patterns when architecture aligns with problem structure.
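One way to picture why a circular latent structure suits modular arithmetic: if each residue is placed on the unit circle, addition mod n becomes a rotation. The sketch below illustrates only that general idea and is not the BRo-JEPA architecture; the function names `embed`, `decode`, and `add_mod` are hypothetical.

```python
import cmath
import math


def embed(k: int, n: int) -> complex:
    """Place residue k (mod n) on the unit circle at angle 2*pi*k/n."""
    return cmath.exp(2j * math.pi * k / n)


def decode(z: complex, n: int) -> int:
    """Map a point on the unit circle back to the nearest residue mod n."""
    angle = cmath.phase(z) % (2 * math.pi)
    return round(angle * n / (2 * math.pi)) % n


def add_mod(a: int, b: int, n: int) -> int:
    """Add two residues by multiplying their embeddings (angles add)."""
    return decode(embed(a, n) * embed(b, n), n)
```

Multiplying two unit-circle points adds their angles, so the wrap-around at n comes from the geometry and needs no explicit modulo step on the inputs.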
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a novel framework for controlling symbolic music generation in Transformer models through activation steering, enabling fine-grained control over musical attributes like pitch and duration without retraining. The approach uses latent space analysis and orthogonalization techniques to independently manipulate multiple attributes while reducing interference and maintaining generation quality.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Lumos-Nexus is a new video generation framework that separates training and inference to improve both reasoning quality and visual fidelity. The system uses a lightweight generator during training and progressively hands off to a high-capacity generator during inference through a technique called Unified Progressive Frequency Bridging, while introducing VR-Bench as a benchmark for reasoning-driven video generation.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers propose Autoregression-Free Neural Operators (AFNO), a new approach for solving time-dependent partial differential equations that models continuous-time evolution in latent space rather than performing recursive predictions. By avoiding autoregressive rollout and using flow matching, AFNO reduces error accumulation over long-horizon predictions and demonstrates improved stability across six PDE benchmarks.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Falcon-X is a new time series foundation model that improves multivariate forecasting by mapping heterogeneous data types into a unified latent space rather than processing raw variables directly. The model uses novel attention mechanisms to capture both positive and negative relationships between variables, achieving state-of-the-art performance on forecasting benchmarks.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠diffGHOST is a new conditional diffusion model that synthesizes mobility trajectories while preserving privacy through latent space segmentation. The approach addresses a critical gap in existing generative models that lack formal privacy guarantees despite handling sensitive personal movement data.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠NoisyCoconut is an inference-time method that improves LLM reliability by injecting controlled noise into internal representations to generate diverse reasoning paths, enabling models to abstain when uncertain without requiring retraining. The technique reduces error rates from 40-70% to below 15% on mathematical reasoning tasks through unanimous agreement among noise-perturbed paths, offering practical reliability improvements compatible with existing models.
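The abstention rule in this summary (answer only when every noise-perturbed path agrees) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `unanimous_or_abstain` is hypothetical, the per-path answers are assumed to be already extracted, and the noise injection into internal representations is not shown.

```python
from typing import Hashable, Optional, Sequence


def unanimous_or_abstain(answers: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the shared answer if all paths agree, otherwise None (abstain)."""
    if not answers:
        # No reasoning paths were produced, so there is nothing to agree on.
        return None
    first = answers[0]
    # Any disagreement among paths is treated as uncertainty.
    return first if all(a == first for a in answers) else None
```

For example, `unanimous_or_abstain(["42", "42", "42"])` returns `"42"`, while `unanimous_or_abstain(["42", "41", "42"])` returns `None`, which signals abstention.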
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers have identified why diffusion transformers (DiTs) degrade in quality during multi-turn image editing and proposed VAE-LFA, a training-free alignment method that operates in VAE latent space to suppress accumulated semantic drift. The solution works with both white-box and black-box models by aligning low-frequency components across editing rounds while preserving high-frequency details.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers have developed a multimodal latent diffusion model that simultaneously synthesizes MRI brain scans and clinical tabular data (age, sex, body measurements) within a shared latent space using cross-attention mechanisms. Tested on over 10,000 participants from the German National Cohort, the system generates anatomically plausible synthetic medical data where image and tabular attributes remain coherently aligned, representing the first successful joint modeling of volumetric medical images with mixed-type clinical data.
AI · Bullish · arXiv – CS AI · May 9 · 6/10
🧠Researchers introduce Memory Inception (MI), a training-free method for steering large language models by inserting text-derived key-value banks at selected attention layers rather than caching full prompts. MI achieves competitive control with instruction prompting while using up to 118x less storage and outperforms existing activation steering methods on personality, reasoning, and guidance tasks.
AI · Neutral · arXiv – CS AI · May 9 · 6/10
🧠Researchers propose Cola DLM, a hierarchical latent diffusion language model that generates text through continuous semantic modeling rather than traditional left-to-right autoregressive decoding. The approach achieves comparable performance to autoregressive models while offering greater flexibility, better scaling properties, and a potential pathway for unified modeling across discrete and continuous modalities.
AI · Neutral · arXiv – CS AI · May 4 · 6/10
🧠Researchers propose Hamiltonian World Models, a physics-grounded approach to generative world modeling that encodes observations into structured latent phase spaces and evolves them through Hamiltonian-inspired dynamics. The framework aims to address limitations in current world models by prioritizing physical accuracy and action-controllability alongside visual realism, with applications to robotics, autonomous driving, and reinforcement learning.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers investigate how large language models represent emotions in their latent spaces, discovering that LLMs develop coherent emotional representations aligned with established psychological models of valence and arousal. The findings support the linear representation hypothesis used in AI transparency methods and demonstrate practical applications for uncertainty quantification in emotion processing tasks.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers introduce AdaAnchor, a new AI reasoning framework that performs silent computation in latent space rather than generating verbose step-by-step reasoning. The system adaptively determines when to stop refining its internal reasoning process, achieving up to 5% better accuracy while reducing token generation by 92-93% and cutting refinement steps by 48-60%.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠Researchers introduce SVG, a new latent diffusion model that eliminates the need for variational autoencoders by using self-supervised representations. The approach leverages frozen DINO features to create semantically structured latent spaces, enabling faster training, fewer sampling steps, and better generative quality while maintaining semantic capabilities.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠CodecFlow is a new neural codec-based framework for speech bandwidth extension that efficiently reconstructs high-quality audio in compact latent space. The system uses conditional flow matching and residual vector quantization to improve speech clarity by restoring high-frequency content from low-bandwidth audio.