y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#representation-learning News & Analysis

85 articles tagged with #representation-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

85 articles
AINeutralarXiv – CS AI · May 116/10
🧠

Learning Cross-Atlas Consistent Brain Disorder Representations via Disentangled Multi-Atlas Functional Connectivity Learning

Researchers propose MADCLE, a machine learning framework that learns consistent brain disorder representations across multiple brain atlases by disentangling disease-related features from atlas-dependent and covariate factors. The approach demonstrates competitive performance on neurological disorder datasets (ADNI and ADHD-200) while addressing the fundamental problem that different brain parcellation schemes produce heterogeneous and sometimes contradictory functional connectivity representations.

AINeutralarXiv – CS AI · May 116/10
🧠

Predictive but Not Plannable: RC-aux for Latent World Models

Researchers present RC-aux, a lightweight auxiliary objective that improves latent world models for planning by addressing the spatiotemporal mismatch between short-horizon prediction training and long-horizon planning deployment. The method adds multi-horizon prediction and budget-conditioned reachability supervision to align learned representations with planning requirements, demonstrating improvements on goal-conditioned control tasks.

AINeutralarXiv – CS AI · May 115/10
🧠

ASPECT: Node-Level Adaptive Spectral Fusion for Graph Contrastive Learning

Researchers introduce ASPECT, a novel spectral graph contrastive learning method that adaptively fuses low- and high-frequency graph signals at the node level rather than uniformly across entire graphs. The approach demonstrates improved representation quality on both homophilic and heterophilic graph benchmarks, addressing limitations in existing graph-level fusion strategies.

AINeutralarXiv – CS AI · May 96/10
🧠

CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning

Researchers introduce CRAFT, a continual learning framework for large language models that prevents catastrophic forgetting by learning low-rank interventions on hidden representations rather than updating model weights. The three-stage approach uses KL divergence-based routing and merging to enable models to acquire new capabilities while maintaining performance on previously learned tasks.

AIBullisharXiv – CS AI · May 96/10
🧠

Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement

Researchers introduce NOVA, a world modeling framework that represents scene state as weights in implicit neural representations (INRs) rather than traditional encoded latent spaces. The approach eliminates decoder bottlenecks, achieves structural disentanglement of scene components, and enables controllable video generation on consumer GPUs with only 40M parameters.

AINeutralarXiv – CS AI · May 76/10
🧠

Dissociating spatial frequency reliance from adversarial robustness advantages in neurally guided deep convolutional neural networks

Researchers challenge the assumption that neural alignment improves adversarial robustness in deep learning models by reducing reliance on high-frequency image details. Their experiments reveal that spatial-frequency bias is likely a byproduct rather than the primary mechanism, suggesting robustness improvements stem from learning human-like visual representations through more complex means.

AINeutralarXiv – CS AI · May 46/10
🧠

Representation in large language models

A research paper argues that Large Language Models operate partly through representation-based information processing rather than pure memorization, settling a fundamental debate in AI theory. This finding has implications for understanding whether LLMs possess genuine cognitive capabilities like beliefs, concepts, and understanding.

AINeutralarXiv – CS AI · May 16/10
🧠

Why Self-Supervised Encoders Want to Be Normal

Researchers develop a theoretical framework connecting Information Bottleneck principles to encoder-decoder learning through rate-distortion analysis, showing optimal representations form soft clusters on probability manifolds. The work introduces Sketched Isotropic Gaussian Regularization (SIGReg) as a principled regularizer for self-supervised, semi-supervised, and supervised learning without requiring variational bounds.

AINeutralarXiv – CS AI · Apr 206/10
🧠

Self-Distillation as a Performance Recovery Mechanism for LLMs: Counteracting Compression and Catastrophic Forgetting

Researchers introduce Self-Distillation Fine-Tuning (SDFT), a framework that recovers performance degradation in Large Language Models caused by compression, quantization, and catastrophic forgetting. Using Centered Kernel Alignment analysis, the study demonstrates that self-distillation works by aligning the student model's high-dimensional manifold with the teacher model's optimal representation structure.

AINeutralarXiv – CS AI · Apr 156/10
🧠

Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space

Researchers demonstrate that large language models develop attractor-like geometric patterns in their activation space when processing identity documents describing persistent agents. Experiments on Llama 3.1 and Gemma 2 show paraphrased identity descriptions cluster significantly tighter than structural controls, suggesting LLMs encode semantic agent identity as stable attractors independent of linguistic variation.

🧠 Llama
AINeutralarXiv – CS AI · Apr 146/10
🧠

A Minimal Model of Representation Collapse: Frustration, Stop-Gradient, and Dynamics

Researchers present a minimal mathematical model demonstrating how representation collapse occurs in self-supervised learning when frustrated (misclassified) samples exist, and show that stop-gradient techniques prevent this failure mode. The work provides closed-form analysis of gradient-flow dynamics and fixed points, offering theoretical insights into why modern embedding-based learning systems sometimes lose discriminative power.

AIBullisharXiv – CS AI · Apr 146/10
🧠

Closed-Form Concept Erasure via Double Projections

Researchers present a novel closed-form method for concept erasure in generative AI models that removes unwanted concepts without iterative training. The technique uses linear transformations and two sequential projection steps to safely edit pretrained models like Stable Diffusion and FLUX while preserving unrelated concepts, completing the process in seconds.

🧠 Stable Diffusion
AINeutralarXiv – CS AI · Apr 146/10
🧠

Shared Emotion Geometry Across Small Language Models: A Cross-Architecture Study of Representation, Behavior, and Methodological Confounds

Researchers demonstrate that five mature small language model architectures (1.5B-8B parameters) share nearly identical emotion vector representations despite exhibiting opposite behavioral profiles, suggesting emotion geometry is a universal feature organized early in model development. The study also deconstructs prior emotion-vector research methodology into four distinct layers of confounding factors, revealing that single correlations between studies cannot safely establish comparability.

🧠 Llama
AINeutralarXiv – CS AI · Apr 146/10
🧠

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

Researchers present a unified framework for understanding how different methods control large language models—including fine-tuning, LoRA, and activation interventions—revealing a fundamental trade-off between steering strength and output quality. The analysis explains this through an activation manifold perspective and introduces SPLIT, a new steering method that improves control while better preserving model coherence.

AINeutralarXiv – CS AI · Apr 136/10
🧠

Silhouette Loss: Differentiable Global Structure Learning for Deep Representations

Researchers introduce Soft Silhouette Loss, a novel machine learning objective that improves deep neural network representations by enforcing intra-class compactness and inter-class separation. The lightweight differentiable loss outperforms cross-entropy and supervised contrastive learning when combined, achieving 39.08% top-1 accuracy compared to 37.85% for existing methods while reducing computational overhead.

AINeutralarXiv – CS AI · Apr 136/10
🧠

From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity

Researchers propose FEAT, a federated learning method that improves continual learning by addressing class imbalance and representation collapse across distributed clients. The approach combines geometric alignment and energy-based correction to better utilize exemplar samples while maintaining performance under dynamic heterogeneity.

AINeutralarXiv – CS AI · Apr 106/10
🧠

A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures

Facebook Research releases EB-JEPA, an open-source library for learning representations through Joint-Embedding Predictive Architectures that predict in representation space rather than pixel space. The framework demonstrates strong performance across image classification (91% on CIFAR-10), video prediction, and action-conditioned world models, making self-supervised learning more accessible for research and practical applications.

AINeutralarXiv – CS AI · Apr 76/10
🧠

Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison

Researchers conducted the first comprehensive analysis of emotion representations in small language models (100M-10B parameters), finding that these models do possess internal emotion vectors similar to larger frontier models. The study evaluated 9 models across 5 architectural families and discovered that emotion representations localize at middle transformer layers, with generation-based extraction methods proving superior to comprehension-based approaches.

🏢 Perplexity🧠 Llama
AIBullisharXiv – CS AI · Apr 66/10
🧠

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Researchers introduce SmartCLIP, a new AI model that improves upon CLIP by addressing information misalignment issues between images and text through modular vision-language alignment. The approach enables better disentanglement of visual representations while preserving cross-modal semantic information, demonstrating superior performance across various tasks.

AIBullisharXiv – CS AI · Apr 66/10
🧠

The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

Researchers introduce Contrastive Fusion (ConFu), a new multimodal machine learning framework that aligns individual modalities and their fused combinations in a unified representation space. The approach captures higher-order dependencies between multiple modalities while maintaining strong pairwise relationships, demonstrating competitive performance on retrieval and classification tasks.

AIBullisharXiv – CS AI · Mar 55/10
🧠

Topological Alignment of Shared Vision-Language Embedding Space

Researchers introduce ToMCLIP, a new framework that improves multilingual vision-language models by using topological alignment to better preserve the geometric structure of shared embedding spaces. The method shows enhanced performance on zero-shot classification and multilingual image retrieval tasks.

AIBullisharXiv – CS AI · Mar 55/10
🧠

Weight Space Representation Learning via Neural Field Adaptation

Researchers have developed a new approach using multiplicative LoRA (Low-Rank Adaptation) weights for neural field representation learning, achieving improved quality in reconstruction, generation, and analysis tasks. The method constrains optimization space through pre-trained base models, creating structured weight representations that outperform existing weight-space methods when used with latent diffusion models.

AINeutralarXiv – CS AI · Mar 37/108
🧠

AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching

Researchers introduce AG-REPA, a new method for improving audio generation models by strategically selecting which neural network layers to align with teacher models. The approach identifies that layers storing the most information aren't necessarily the most important for generation, leading to better performance in speech and audio synthesis.

AINeutralLil'Log (Lilian Weng) · Sep 286/10
🧠

Anatomize Deep Learning with Information Theory

Professor Naftali Tishby applied information theory to analyze deep neural network training, proposing the Information Bottleneck method as a new learning bound for DNNs. His research identified two distinct phases in DNN training: first representing input data to minimize generalization error, then compressing representations by forgetting irrelevant details.

← PrevPage 3 of 4Next →