y0news

#feature-learning News & Analysis

6 articles tagged with #feature-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

Muon Learns More Robust and Transferable Features than Adam

Research demonstrates that Muon, an emerging optimizer for large language models and vision classifiers, produces more robust and transferable features than Adam and SGD across multiple architectures. The study shows Muon-learned features maintain superior performance on corrupted data and transfer more effectively to downstream tasks, with theoretical support provided through margin and effective rank analysis.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Class-Specific Branch Attention for Mitigating Gradient Interference under Class Imbalance

Researchers introduce Class-Specific Branch Attention (CSBA), a neural network modification that addresses gradient interference in deep learning models trained on imbalanced datasets. The technique nearly doubles the F1 score for minority classes while maintaining overall accuracy.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

Researchers present a theoretical framework analyzing scaling laws for shallow neural networks in the feature learning regime, deriving phase diagrams that connect sample complexity and weight decay to risk exponents. The work bridges empirical observations in deep learning with rigorous mathematical analysis, establishing links between weight spectrum properties and generalization performance through matrix compressed sensing and LASSO theory.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking

Researchers empirically validate theoretical predictions about feature repulsion in neural network grokking. The predicted sign structure holds consistently across activation functions, but the spectral signature of the mechanism in weight updates depends critically on activation type: it appears sharply with quadratic activations and remains invisible in ReLU networks.

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

Improving Robustness In Sparse Autoencoders via Masked Regularization

Researchers propose a masked regularization technique to improve the robustness and interpretability of Sparse Autoencoders (SAEs) used in large language model analysis. The method addresses feature absorption and out-of-distribution performance failures by randomly replacing tokens during training to disrupt co-occurrence patterns, offering a practical path toward more reliable mechanistic interpretability tools.
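As a rough illustration of the "randomly replacing tokens" step, here is a minimal sketch. It assumes uniform replacement from the vocabulary at a fixed per-token probability. The function name `randomly_replace_tokens` and the parameters `vocab_size` and `replace_prob` are hypothetical; the paper's actual replacement distribution, rate, and training loop are not described in this summary.

```python
import random


def randomly_replace_tokens(token_ids, vocab_size, replace_prob, rng):
    """Return a copy of token_ids with some positions randomly replaced.

    Assumption (not from the paper): each position is replaced independently
    with probability replace_prob by an id drawn uniformly from the vocabulary.
    """
    corrupted = []
    for tok in token_ids:
        if rng.random() < replace_prob:
            # Swap in a random vocabulary id to break co-occurrence patterns.
            corrupted.append(rng.randrange(vocab_size))
        else:
            corrupted.append(tok)
    return corrupted


# Hypothetical usage: corrupt a short sequence before collecting activations.
rng = random.Random(0)
original = [11, 42, 7, 99, 3]
corrupted = randomly_replace_tokens(original, vocab_size=1000, replace_prob=0.15, rng=rng)
```

In a setup like this, the corrupted sequence would be fed through the language model and the SAE trained on the resulting activations; how that is combined with the clean-data objective is left open here.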

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

Maximizing Asynchronicity in Event-based Neural Networks

Researchers have developed EVA (EVent Asynchronous feature learning), a framework that improves event-based neural networks by adapting language modeling techniques to process asynchronous visual data from event cameras. EVA reports stronger performance on recognition and detection tasks, including 0.477 mAP on the Gen1 detection dataset.