y0news

#distribution-shift News & Analysis

34 articles tagged with #distribution-shift. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

EpiEvolve: Self-Evolving Agents for Streaming Pandemic Forecasting under Regime Shifts

Researchers introduce EpiEvolve, a self-evolving AI agent that improves pandemic forecasting by adapting to changing disease patterns in real-time streaming scenarios. The system achieves 12% higher accuracy than static models and reduces recovery time after major shifts from 5 weeks to 2 weeks by leveraging episodic memory and strategic rule learning.

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

Shortcut to Nowhere: Demystifying Deep Spurious Regression

Researchers introduce Deep Spurious Regression (DSR), a framework addressing how machine learning models rely on unreliable correlations when predicting continuous values rather than categorical labels. The work identifies a critical gap in AI robustness research, which has largely focused on classification tasks, and proposes techniques to improve model generalization across different data distributions by calibrating feature and label spaces.

AI · Bearish · arXiv – CS AI · May 29 · 7/10

Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts

Researchers benchmarked five physics foundation models across 8 physical dynamics and 25 test regimes, revealing that current models function as conditional rather than universal generalists. The study demonstrates that model performance heavily depends on physical regime, temporal scale, and distribution shifts, with pretraining and scaling unable to reliably overcome these limitations.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Do Linear Probes Generalize Better in Persona Coordinates?

Researchers propose using 'persona coordinates'—low-dimensional subspaces derived from contrasting harmful and harmless model behaviors—to improve the generalization of linear probes that monitor language models for deception and harmful outputs. Testing across 10 datasets shows that probes trained on persona-derived directions significantly outperform those trained on raw model activations, addressing a critical gap in AI safety monitoring.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

OrthoFormer: Instrumental Variable Estimation in Transformer Hidden States via Neural Control Functions

Researchers propose OrthoFormer, a new Transformer architecture that addresses causal learning limitations by embedding instrumental variable estimation directly into neural networks. The framework aims to distinguish between spurious correlations and true causal mechanisms, potentially improving AI model robustness and reliability under distribution shifts.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

The Information-Theoretic Imperative: Compression and the Epistemic Foundations of Intelligence

Researchers propose the Compression Efficiency Principle (CEP) to explain why artificial neural networks and biological brains develop similar representations despite different substrates. The theory suggests both systems converge on efficient compression strategies that encode stable invariants rather than unstable correlations, providing a unified framework for understanding intelligence across biological and artificial systems.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

TAPIOCA: Why Task-Aware Pruning Improves OOD model Capability

Researchers demonstrate that task-aware layer pruning improves model performance on out-of-distribution (OOD) data while providing no benefits for in-distribution data. The improvement occurs because pruning removes layers that distort the task-adapted geometric representation, realigning OOD inputs with the model's learned task geometry.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models

LargeMonitor is a new framework that uses large pretrained foundation models to detect and diagnose distribution shifts in online task-free continual learning systems without requiring explicit task labels or training-coupled optimization. The approach decouples drift detection from adaptation strategy selection, enabling more precise responses to different types of data stream variations.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Difference-Aware Retrieval Policies for Imitation Learning

Researchers present DARP, a semi-parametric retrieval-based approach to imitation learning that improves upon standard behavior cloning by predicting actions based on k-nearest neighbors from training data rather than learning a global policy. The method achieves 15-46% performance improvements across continuous control and robotic manipulation tasks without requiring additional data collection or expert feedback.
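The retrieval idea in this summary can be illustrated with a minimal sketch. It is plain k-nearest-neighbor action averaging, not DARP itself: the summary does not say how DARP uses the differences between the query and its neighbors. The function name `knn_action`, the Euclidean distance, and the unweighted mean are all assumptions made for illustration.

```python
import math


def knn_action(query_state, demo_states, demo_actions, k=3):
    """Average the actions of the k demonstrations closest to query_state.

    Hypothetical helper: a generic retrieval baseline, not the DARP method.
    """
    if not demo_states or len(demo_states) != len(demo_actions):
        raise ValueError("need equally sized, non-empty demonstration lists")
    k = min(k, len(demo_states))
    # Rank demonstration indices by Euclidean distance to the query state.
    ranked = sorted(
        range(len(demo_states)),
        key=lambda i: math.dist(query_state, demo_states[i]),
    )
    neighbours = ranked[:k]
    dim = len(demo_actions[0])
    # Unweighted mean of the neighbours' actions, one value per action dimension.
    return [sum(demo_actions[i][d] for i in neighbours) / k for d in range(dim)]
```

Unlike a global policy, this lookup needs no training step. Each prediction depends only on the demonstrations nearest to the current state.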

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

When Tabular Foundation Models Meet Strategic Tabular Data: A Prior Alignment Approach

Researchers propose Strategic Prior-data Fitted Network (SPN), a framework addressing how tabular foundation models fail when users strategically manipulate data post-deployment. The method adapts pretrained models to strategic environments through inference-time adjustments without retraining, demonstrating improved robustness on real-world datasets.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

SV-Detect: AI-generated Text Detection with Steering Vectors

Researchers have developed SV-Detect, a detector for AI-generated text that uses steering vectors extracted from language model hidden layers to distinguish human-written from machine-generated text. The method demonstrates robust performance across domain shifts, different source models, and edited content, framing AI-generated text detection as a representation-space probing problem rather than surface-level analysis.

AI · Neutral · arXiv – CS AI · Jun 5 · 5/10

Bridging Domain Expertise and Generalization for Performance Estimation

Researchers propose FRAP (Fused Reference Alignment Prediction), a method that combines a foundation model with a domain-specific base model to improve performance estimation when AI models encounter distribution shifts. By aligning and fusing predictions from both models through calibration, FRAP provides more reliable performance indicators without ground-truth labels.

AI · Bullish · arXiv – CS AI · Jun 4 · 6/10

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

Researchers introduce ADAPTOOD, a framework that uses data uncertainty to improve machine learning model performance on out-of-distribution time series data, particularly for ECG analysis. The method achieves up to 7% higher accuracy than existing approaches by quantifying distribution shift severity and adapting hyperparameters accordingly, addressing a critical challenge in deploying medical AI models across diverse real-world settings.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Behavior-Invariant Task Representation Learning with Transformer-based World Models for Offline Meta-Reinforcement Learning

Researchers propose a novel offline meta-reinforcement learning framework combining information-theoretic task representation learning with Transformer-based world models to address distribution shifts in sparse-reward environments. The approach extracts behavior-invariant task representations and applies conservative value penalties to prevent model exploitation, demonstrating improved generalization over existing methods.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Task diversity produces systematic transfer but inhibits continual reinforcement learning

Researchers introduce Banyan, a benchmark for studying continual reinforcement learning that reveals task diversity improves immediate transfer between tasks but fails to sustain learning across multiple distribution shifts. While agents trained on diverse tasks generalize well to new task distributions, they forget earlier tasks and struggle with longer-horizon objectives as training continues.

AI · Bullish · arXiv – CS AI · Jun 2 · 6/10

Train, Test, Re-evaluate: Schedule-Sensitive Evaluation of Generative Data for Hand Detection

Researchers demonstrate that synthetic data generated through inpainting can effectively augment hand detection models for safety-critical applications when trained using multi-stage scheduling approaches. The study shows that combining real and synthetic data with strategic fine-tuning improves detection accuracy on out-of-distribution scenarios like gloved hands, addressing a critical gap in occupational safety systems.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

What changes after deployment? A survey on On-device Learning in TinyML

This survey examines on-device learning (ODL) in TinyML systems, analyzing how 70 existing solutions address the challenge of distribution shift in deployed machine learning models on microcontrollers. The research identifies a critical gap between academic benchmarks and real-world deployment scenarios, emphasizing that different types of distribution change require tailored technical approaches.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Entropic Projection Alignment: Estimating, Explaining, and Improving Model Performance Under Distribution Shift

Researchers propose Entropic Projection Alignment (EPA), a machine learning framework that addresses distribution shift—when models encounter data different from their training set. The method estimates performance on unlabeled target domains, identifies responsible features, and improves accuracy through moment matching and closed-form importance weights, offering both theoretical guarantees and computational efficiency.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Target-Agnostic Calibration under Distribution Shift with Frequency-Aware Gradient Rectification

Researchers propose Frequency-aware Gradient Rectification (FGR), a training framework that improves neural network calibration under distribution shifts without requiring access to target domains. The method uses low-pass filtering to reduce spurious patterns while maintaining in-distribution performance through geometric constraint projection.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Researchers propose EKSFT, a novel fine-tuning method that selectively masks high-entropy and high-KL divergence tokens during supervised fine-tuning of large language models. The approach aims to preserve pre-trained model distributions while efficiently activating task-relevant capabilities in low-data regimes, demonstrating improved performance on mathematical reasoning benchmarks.
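A rough sketch of what per-token masking by entropy and KL divergence could look like is given here. The summary does not specify which distributions are compared, the direction of the KL term, or how thresholds are chosen, so those choices are assumptions: the function `loss_mask`, its threshold parameters, and the use of a reference (pre-trained) distribution versus a current-model distribution are all hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0)


def loss_mask(ref_dists, cur_dists, max_entropy, max_kl):
    """Return 1 for tokens kept in the fine-tuning loss and 0 for masked ones.

    Assumption: a token is masked when the reference distribution is
    high-entropy OR the current model has drifted far from the reference.
    """
    mask = []
    for p_ref, p_cur in zip(ref_dists, cur_dists):
        masked = entropy(p_ref) > max_entropy or kl(p_cur, p_ref) > max_kl
        mask.append(0 if masked else 1)
    return mask
```

In a training loop, such a mask would typically multiply the per-token cross-entropy before averaging, so that masked positions contribute no gradient.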

AI · Neutral · arXiv – CS AI · May 28 · 6/10

On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective

Researchers introduce the first theoretical framework for analyzing test-time adaptation (TTA) in machine learning, establishing recovery complexity bounds that reveal fundamental limits on how quickly models can adapt to non-stationary data streams without labeled data. The work provides mathematical guarantees for TTA learnability and identifies an intrinsic trade-off between adaptivity and information constraints.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

From Static Context to Calibrated Interactive RL: Mitigating Distribution Shift in Multi-turn Dialogue with Aligned Simulator

Researchers propose Calibrated Interactive RL, a framework that addresses distribution shift in multi-turn dialogue systems by combining interactive reinforcement learning with simulator alignment. The authors show, theoretically and empirically, that aligning simulators with human interaction patterns significantly improves LLM-based dialogue agent performance compared with static-context and unaligned interactive methods.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

SL-BiLEM: Structured Learnable Behavior-in-the-Loop Epidemic Modeling for Forecasting and Policy Evaluation

Researchers introduce SL-BiLEM, a machine learning framework that improves epidemic forecasting by accounting for how human behavior changes in response to disease spread and policy interventions. The model uses physical constraints to maintain accuracy even when facing novel policy scenarios, demonstrating 76% improvement over existing neural baselines and potential applications for public health decision-making.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge

Researchers demonstrate that reasoning-capable LLMs improve judgment accuracy significantly on complex tasks like math and coding, but offer minimal or negative benefits on simpler evaluations while consuming substantially more computational resources. They introduce RACER, an adaptive routing algorithm that dynamically selects between reasoning and non-reasoning judges under budget constraints while accounting for distribution shifts.
