y0news

#machine-learning News & Analysis

2484 articles tagged with #machine-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 46/102
🧠

Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression

Researchers propose Router Knowledge Distillation (Router KD) to improve retraining-free compression of Mixture-of-Experts (MoE) models by calibrating routers while keeping expert parameters unchanged. The method addresses router-expert mismatch issues that cause performance degradation in compressed MoE models, showing particularly strong results in fine-grained MoE architectures.
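The summary above describes recalibrating only the router of a pruned MoE while experts stay frozen. A minimal sketch of that idea follows; the sizes, surviving-expert indices, optimizer, and loss are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

# Toy Router KD sketch: after pruning 8 experts down to 4, fit the pruned
# model's router so its routing distribution over the surviving experts
# matches the original router's (renormalized) distribution.
rng = np.random.default_rng(0)
d, kept = 16, [0, 2, 5, 7]                      # surviving experts (assumed)
W_teacher = rng.normal(size=(d, 8))             # original router (frozen)
b_teacher = rng.normal(size=8)
W_student = rng.normal(size=(d, len(kept)))     # pruned model's router
b_student = np.zeros(len(kept))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    """Mean KL divergence KL(p || q) over a batch of distributions."""
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

x = rng.normal(size=(256, d))                   # calibration activations
t_probs = softmax((x @ W_teacher + b_teacher)[:, kept])  # teacher targets

lr = 0.1
initial_loss = kl(t_probs, softmax(x @ W_student + b_student))
for _ in range(500):
    s_probs = softmax(x @ W_student + b_student)
    g = (s_probs - t_probs) / len(x)            # grad of mean KL wrt logits
    W_student -= lr * (x.T @ g)
    b_student -= lr * g.sum(axis=0)
final_loss = kl(t_probs, softmax(x @ W_student + b_student))
```

Only the router's parameters are updated; in a real MoE the frozen expert weights would then be served unchanged behind the calibrated router.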

AI · Neutral · arXiv – CS AI · Mar 47/104
🧠

SorryDB: Can AI Provers Complete Real-World Lean Theorems?

Researchers have introduced SorryDB, a dynamic benchmark for evaluating AI systems' ability to prove mathematical theorems using the Lean proof assistant. The benchmark draws from 78 real-world formalization projects and addresses limitations of static benchmarks by providing continuously updated tasks that better reflect community needs.

AI · Bullish · arXiv – CS AI · Mar 46/104
🧠

Large Electron Model: A Universal Ground State Predictor

Researchers introduce the Large Electron Model, a neural network built on a Fermi Sets architecture that predicts ground state wavefunctions of interacting electrons across different Hamiltonian parameters. The model demonstrates accurate predictions for up to 50 particles and generalizes across unseen coupling strengths, potentially advancing material discovery beyond the limitations of density functional theory.

AI · Bullish · arXiv – CS AI · Mar 47/103
🧠

Self-Play Only Evolves When Self-Synthetic Pipeline Ensures Learnable Information Gain

Researchers propose a framework for sustainable AI self-evolution through triadic roles (Proposer, Solver, Verifier) that ensures learnable information gain across iterations. The study identifies three key system designs to prevent the common plateau effect in self-play AI systems: asymmetric co-evolution, capacity growth, and proactive information seeking.

AI · Bullish · arXiv – CS AI · Mar 47/102
🧠

RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning

Researchers developed RxnNano, a compact 0.5B-parameter AI model for chemical reaction prediction that outperforms much larger 7B+ parameter models by 23.5% through novel training techniques focused on chemical understanding rather than scale. The framework uses hierarchical curriculum learning and chemical consistency objectives to improve drug discovery and synthesis planning applications.

AI · Bullish · arXiv – CS AI · Mar 47/103
🧠

Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

Researchers introduce Density-Guided Response Optimization (DGRO), a new AI alignment method that learns community preferences from implicit acceptance signals rather than explicit feedback. The technique uses geometric patterns in how communities naturally engage with content to train language models without requiring costly annotation or preference labeling.

AI · Bullish · arXiv – CS AI · Mar 47/102
🧠

NExT-Guard: Training-Free Streaming Safeguard without Token-Level Labels

Researchers introduce NExT-Guard, a training-free framework for real-time AI safety monitoring that uses Sparse Autoencoders to detect unsafe content in streaming language models. The system outperforms traditional supervised training methods while requiring no token-level annotations, making it more cost-effective and scalable for deployment.
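As a toy illustration of the general approach (the feature indices and threshold below are invented for illustration; the actual system derives features from a Sparse Autoencoder over model activations), a streamed token can be flagged when an unsafe-associated feature fires strongly, with no token-level labels involved:

```python
# Threshold-based streaming flagging over assumed SAE feature activations.
UNSAFE_FEATURES = [3, 17]   # assumed SAE features tied to unsafe concepts
THRESHOLD = 2.5             # assumed calibration on held-out safe text

def flag_token(sae_activations):
    """Flag a streamed token if any unsafe-associated feature fires strongly."""
    return any(sae_activations[i] > THRESHOLD for i in UNSAFE_FEATURES)

safe_acts = [0.0] * 32      # quiet activations: token passes
unsafe_acts = [0.0] * 32
unsafe_acts[17] = 4.0       # one unsafe feature fires: token is flagged
```

Because the check runs per token as text streams out, unsafe content can be cut off mid-generation rather than after the full response.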

AI · Bullish · arXiv – CS AI · Mar 47/103
🧠

ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Researchers introduce ParamΔ, a novel method for transferring post-training capabilities to updated language models without additional training costs. The technique achieves 95% of the performance of traditional post-training by computing weight differences between base and post-trained models, offering significant cost savings for AI model development.
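The weight-difference idea lends itself to a very small sketch, assuming checkpoints are plain parameter dictionaries (a simplification of real model formats):

```python
import numpy as np

def param_delta(base, post, new_base):
    """Apply the post-training delta (post - base) to an updated base checkpoint."""
    return {k: new_base[k] + (post[k] - base[k]) for k in base}

# Toy checkpoints: post-training shifted every weight by +0.5.
base = {"w": np.array([1.0, 2.0])}
post = {"w": np.array([1.5, 2.5])}
new_base = {"w": np.array([1.1, 2.1])}   # newly released base model

merged = param_delta(base, post, new_base)
# merged["w"] == [1.6, 2.6]: the new base inherits the same +0.5 delta
```

No gradient steps are taken anywhere, which is why the transfer is effectively free compared with redoing post-training on the new base.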

AI · Bullish · arXiv – CS AI · Mar 46/102
🧠

Predicting Tuberculosis from Real-World Cough Audio Recordings and Metadata

Researchers developed an AI system that can detect tuberculosis from cough recordings with 70% accuracy using audio alone, improving to 81% when combined with clinical metadata. The study used real-world data from a phone-based app across Africa and Asia, suggesting mobile applications could enhance TB diagnosis in community health settings.

AI · Bullish · arXiv – CS AI · Mar 46/102
🧠

Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles

Researchers introduce RigidSSL, a new geometric pretraining framework for protein design that improves designability by up to 43% and enhances success rates in protein generation tasks. The two-phase approach combines geometric learning from 432K protein structures with molecular dynamics refinement to better capture protein conformational dynamics.

AI · Neutral · arXiv – CS AI · Mar 46/102
🧠

The Malignant Tail: Spectral Segregation of Label Noise in Over-Parameterized Networks

Researchers identify the 'Malignant Tail' phenomenon, in which over-parameterized neural networks segregate signal from label noise during training, with the noise-dominated tail driving harmful overfitting. They demonstrate that Stochastic Gradient Descent pushes label noise into high-frequency orthogonal subspaces while preserving semantic features in low-rank subspaces, and propose Explicit Spectral Truncation as a post-hoc solution to recover optimal generalization.
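Spectral truncation of this kind can be illustrated on a toy low-rank-plus-noise matrix; the rank, noise scale, and SVD-based procedure below are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

# Toy setup: a rank-4 "signal" weight matrix corrupted by full-rank noise,
# standing in for label noise spread across the spectral tail.
rng = np.random.default_rng(0)
signal = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 64))  # rank-4 signal
noise = 0.05 * rng.normal(size=(64, 64))                      # full-rank noise
W = signal + noise

# Post-hoc truncation: keep only the top-k spectral components, discarding
# the tail where the noise energy lives.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 4
W_trunc = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

err_before = np.linalg.norm(W - signal)       # error of the noisy matrix
err_after = np.linalg.norm(W_trunc - signal)  # error after truncation
```

Because the signal's energy sits in the leading singular directions, dropping the tail removes most of the noise while leaving the low-rank semantic structure essentially intact.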

AI · Bullish · arXiv – CS AI · Mar 46/102
🧠

Rethinking Code Similarity for Automated Algorithm Design with LLMs

Researchers introduce BehaveSim, a new method to measure algorithmic similarity by analyzing problem-solving behavior rather than code syntax. The approach enhances AI-driven algorithm design frameworks and enables systematic analysis of AI-generated algorithms through behavioral clustering.
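A hedged sketch of behavior-based similarity, using two syntactically different sorting routines as stand-ins for AI-generated algorithms (the probe design and agreement metric are invented simplifications of whatever BehaveSim actually measures):

```python
import random

def bubble_sort(xs):
    """Quadratic in-place-style sort: syntactically unlike sorted()."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def builtin_sort(xs):
    return sorted(xs)

def behavior_similarity(f, g, probes):
    """Fraction of probe inputs on which the two algorithms agree."""
    return sum(f(p) == g(p) for p in probes) / len(probes)

random.seed(0)
probes = [[random.randint(0, 9) for _ in range(5)] for _ in range(100)]
score = behavior_similarity(bubble_sort, builtin_sort, probes)
# score == 1.0: very different code, identical problem-solving behavior
```

A text-similarity metric would rate these two functions as dissimilar, while a behavioral one correctly treats them as the same algorithm family, which is the point of clustering by behavior.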

AI · Bullish · arXiv – CS AI · Mar 46/103
🧠

RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

Researchers introduce RAPO (Retrieval-Augmented Policy Optimization), a new reinforcement learning framework that improves LLM agent training by incorporating retrieval mechanisms for broader exploration. The method achieves 5% performance gains across 14 datasets and 1.2x faster training by using hybrid-policy rollouts and retrieval-aware optimization.

AI · Neutral · arXiv – CS AI · Mar 47/103
🧠

Forecasting as Rendering: A 2D Gaussian Splatting Framework for Time Series Forecasting

Researchers introduce TimeGS, a novel time series forecasting framework that reimagines prediction as 2D generative rendering using Gaussian splatting techniques. The approach addresses key limitations in existing methods by treating future sequences as continuous latent surfaces and enforcing temporal continuity across periodic boundaries.

AI · Neutral · arXiv – CS AI · Mar 46/102
🧠

Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation

Researchers propose PURE, a new framework for AI-powered recommendation systems that addresses preference-inconsistent explanations, where the AI provides factually correct but unconvincing reasoning that conflicts with user preferences. The system uses a select-then-generate approach to improve both evidence selection and explanation generation, demonstrating reduced hallucinations while maintaining recommendation accuracy.

AI · Neutral · arXiv – CS AI · Mar 47/103
🧠

Retrievit: In-context Retrieval Capabilities of Transformers, State Space Models, and Hybrid Architectures

Researchers compare Transformers, State Space Models (SSMs), and hybrid architectures on in-context retrieval tasks, finding that hybrid models excel at information-dense retrieval while Transformers remain superior for position-based tasks. SSM-based models develop distinctive locality-aware embeddings that create interpretable positional structures, explaining their specific strengths and limitations.

AI · Bullish · arXiv – CS AI · Mar 46/103
🧠

Concept Heterogeneity-aware Representation Steering

Researchers introduce CHaRS (Concept Heterogeneity-aware Representation Steering), a new method for controlling large language model behavior that uses optimal transport theory to create context-dependent steering rather than global directions. The approach models representations as Gaussian mixture models and derives input-dependent steering maps, showing improved behavioral control over existing methods.

AI · Bullish · arXiv – CS AI · Mar 47/103
🧠

Social-JEPA: Emergent Geometric Isomorphism

Researchers developed Social-JEPA, showing that separate AI agents learning from different viewpoints of the same environment develop internal representations that are mathematically aligned through approximate linear isometry. This enables models trained on one agent to work on another without retraining, suggesting a path toward interoperable decentralized AI vision systems.

AI · Bullish · arXiv – CS AI · Mar 47/102
🧠

Generalized Discrete Diffusion with Self-Correction

Researchers propose Self-Correcting Discrete Diffusion (SCDD), a new AI model that improves upon existing discrete diffusion models by reformulating self-correction with explicit state transitions. The method enables more efficient parallel decoding while maintaining generation quality, demonstrating improvements at GPT-2 scale.

AI · Neutral · arXiv – CS AI · Mar 47/104
🧠

Estimating Visual Attribute Effects in Advertising from Observational Data: A Deepfake-Informed Double Machine Learning Approach

Researchers developed DICE-DML, a new framework that uses deepfake technology and machine learning to measure causal effects of visual attributes in digital advertising. The method addresses bias issues in standard approaches when analyzing how image elements like skin tone affect consumer engagement on social media platforms.

AI · Neutral · arXiv – CS AI · Mar 47/105
🧠

Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

Researchers introduce Federated Inference (FI), a new collaborative paradigm where independently trained AI models can work together at inference time without sharing data or model parameters. The study identifies key requirements including privacy preservation and performance gains, while highlighting system-level challenges that differ from traditional federated learning approaches.