#distribution-shift News & Analysis

34 articles tagged with #distribution-shift. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

34 articles

AINeutralarXiv – CS AI · May 126/10

🧠

Rethinking Entropy Minimization in Test-Time Adaptation for Autoregressive Models

Researchers present a unified mathematical framework for Test-Time Adaptation (TTA) in autoregressive generative models, decomposing entropy minimization into token-level policy gradient and entropy losses. Validated on Whisper ASR across 20+ domains, the approach demonstrates consistent performance improvements and reconciles previously disparate adaptation methods under a single theoretical foundation.

AINeutralarXiv – CS AI · May 126/10

🧠

Normalization Equivariance for Arbitrary Backbones, with Application to Image Denoising

Researchers present a parameter-free wrapper method (WNE) that enforces Normalization Equivariance—robustness to brightness and contrast shifts—around any neural network backbone without architectural constraints. The approach characterizes NE as a normalize-process-denormalize factorization, enabling compatibility with modern components like transformers and attention mechanisms while avoiding the 1.6x computational overhead of existing methods.

AINeutralarXiv – CS AI · May 46/10

🧠

TimeRFT: Stimulating Generalizable Time Series Forecasting for TSFMs via Reinforcement Finetuning

Researchers introduce TimeRFT, a reinforcement learning-based fine-tuning method for Time Series Foundation Models that improves forecasting accuracy and generalization. By implementing temporal reward mechanisms and intelligent data selection, TimeRFT outperforms traditional supervised fine-tuning approaches across diverse forecasting tasks and data conditions.

AINeutralarXiv – CS AI · Apr 206/10

🧠

Beyond Surface Statistics: Robust Conformal Prediction for LLMs via Internal Representations

Researchers propose a conformal prediction framework for large language models that uses internal neural representations rather than surface-level outputs to assess reliability and uncertainty. The Layer-Wise Information scoring method improves prediction validity under distribution shift while maintaining competitive performance, addressing a critical challenge in deploying LLMs where traditional uncertainty signals become unreliable.

AINeutralarXiv – CS AI · Apr 146/10

🧠

When Valid Signals Fail: Regime Boundaries Between LLM Features and RL Trading Policies

Researchers demonstrate that large language models can extract predictive features from financial news with valid intermediate signals (Information Coefficient >0.15), yet these features fail to improve reinforcement learning trading agents during macroeconomic shocks. The findings reveal a critical gap between feature-level validity and downstream policy robustness, suggesting that valid signals alone cannot guarantee trading performance under distribution shifts.

AINeutralarXiv – CS AI · Apr 146/10

🧠

Understanding Generalization in Role-Playing Models via Information Theory

Researchers introduce R-EMID, an information-theoretic metric to diagnose how distribution shifts degrade role-playing model performance in real-world deployments. The framework reveals that user shifts pose the greatest generalization risk, while co-evolving reinforcement learning provides the most effective mitigation strategy.

AINeutralarXiv – CS AI · Mar 35/104

🧠

Spurious Correlation-Aware Embedding Regularization for Worst-Group Robustness

Researchers propose SCER (Spurious Correlation-Aware Embedding Regularization), a new deep learning approach that improves AI model robustness by regularizing feature representations to suppress spurious correlations. The method demonstrates superior performance in worst-group accuracy across vision and language tasks compared to existing state-of-the-art approaches.

AIBullisharXiv – CS AI · Mar 26/109

🧠

ProtoDCS: Towards Robust and Efficient Open-Set Test-Time Adaptation for Vision-Language Models

Researchers propose ProtoDCS, a new framework for robust test-time adaptation of Vision-Language Models in open-set scenarios. The method uses Gaussian Mixture Model verification and uncertainty-aware learning to better handle distribution shifts while maintaining computational efficiency.

AINeutralarXiv – CS AI · Mar 54/10

🧠

BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning

Researchers introduce BD-Merging, a new AI framework that improves model merging for multi-task learning by addressing bias and distribution shift issues. The method uses uncertainty modeling and contrastive learning to create more reliable AI systems that can better handle real-world data variations.

← PrevPage 2 of 2