
#online-learning News & Analysis

13 articles tagged with #online-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Continuous Latent Contexts Enable Efficient Online Learning in Transformers

Researchers demonstrate that transformer models equipped with continuous latent context tokens can efficiently implement online learning algorithms without parameter updates. A small GPT-2-style model trained with this approach outperforms much larger language models on synthetic online prediction tasks, suggesting a promising architectural direction for adaptive AI systems.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Rethinking Data Curation in LLM Training: Online Reweighting Offers Better Generalization than Offline Methods

Researchers propose ADAPT, an online data reweighting framework that dynamically adjusts training sample importance during LLM training rather than using static offline selection methods. This approach maintains data diversity while improving generalization, outperforming existing offline curation techniques on instruction tuning and large-scale pretraining tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

OpenClaw-RL: Train Any Agent Simply by Talking

OpenClaw-RL is a new reinforcement learning framework that enables AI agents to learn continuously from any type of interaction, including conversations, terminal commands, and GUI interactions. The system extracts learning signals from user responses and feedback, allowing agents to improve simply by being used in real-world scenarios.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

EARCP: Self-Regulating Coherence-Aware Ensemble Architecture for Sequential Decision Making -- Ensemble Auto-Regule par Coherence et Performance

Researchers introduce EARCP, a new ensemble architecture for AI that dynamically weights different expert models based on performance and coherence. The system provides theoretical guarantees with sublinear regret bounds and has been tested on time series forecasting, activity recognition, and financial prediction tasks.
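The idea of reweighting experts by performance while keeping regret sublinear can be illustrated with the classic Hedge (multiplicative weights) update. This is a minimal sketch of that standard technique, not EARCP itself: the coherence term described in the summary is not modeled, and the function name `hedge_weights` and the learning rate `eta` are illustrative assumptions.

```python
import math


def hedge_weights(losses_per_round, eta=0.5):
    """Return normalized expert weights after running Hedge.

    losses_per_round: list of rounds, each a list with one loss in [0, 1]
    per expert. `eta` is a hypothetical learning rate chosen for illustration.
    """
    n_experts = len(losses_per_round[0])
    # Track log-weights to avoid underflow over many rounds.
    log_w = [0.0] * n_experts
    for losses in losses_per_round:
        # Multiplicative update: w_i <- w_i * exp(-eta * loss_i).
        log_w = [lw - eta * loss for lw, loss in zip(log_w, losses)]
    # Normalize with the max subtracted for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]
```

Calling `hedge_weights([[0.1, 0.9], [0.2, 0.8]])` shifts weight toward the first expert, which incurred the lower loss in both rounds.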

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

When Drafts Evolve: Speculative Decoding Meets Online Learning

Researchers introduce OnlineSpec, a framework that uses online learning to continuously improve draft models in speculative decoding for large language model inference acceleration. The approach leverages verification feedback to evolve draft models dynamically, achieving up to 24% speedup improvements across seven benchmarks and three foundation models.

AI · Neutral · arXiv – CS AI · 6d ago · 6/10

The Sample Complexity of Multiclass and Sparse Contextual Bandits

Researchers present optimal algorithms for sparse contextual bandits that achieve sample complexity of Õ((s/ε² + |A|/ε)log|Π|/δ), closing a gap from prior work that had exponential dependence on action set size. The results apply to multiclass classification and combinatorial semi-bandits through information-theoretic and algorithmic approaches.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

EvoSpec: Evolving Speculative Decoding via Real-Time Vocabulary and Parameter Adaptation

EvoSpec introduces a dynamic framework for accelerating Large Language Model inference through real-time adaptation of vocabulary and parameters in speculative decoding. By addressing the vocabulary bottleneck that causes performance degradation in specialized domains, EvoSpec achieves 1.13x speedup improvements over static baselines while reducing memory overhead by 27%.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective

Researchers introduce the first theoretical framework for analyzing test-time adaptation (TTA) in machine learning, establishing recovery complexity bounds that reveal fundamental limits on how quickly models can adapt to non-stationary data streams without labeled data. The work provides mathematical guarantees for TTA learnability and identifies an intrinsic trade-off between adaptivity and information constraints.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

Online Irregular Multivariate Time Series Forecasting via Uncertainty-Driven Dual-Expert Calibration

Researchers propose Under-Cali, a machine learning framework for forecasting irregular multivariate time series data in real-time online settings. The system uses uncertainty estimation and dual-expert calibration to maintain accuracy despite dynamic data distribution shifts, achieving improvements over existing methods with minimal computational overhead.

AI · Neutral · arXiv – CS AI · May 12 · 5/10

Multi-Armed Bandits With Best-Action Queries

Researchers resolve an open problem in multi-armed bandit theory by characterizing how best-action oracle queries improve learning algorithms in the realistic bandit-feedback model. They prove that benefits depend critically on reward structure: correlated stochastic rewards cannot achieve the theoretical gains seen in full-feedback settings, while i.i.d. stochastic rewards maintain near-optimal improvements with logarithmic precision.

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback

Researchers developed a new variance-reduced EXP4-based algorithm for optimizing routing policies in multi-layer hierarchical inference systems. The solution addresses the challenge of sparse, policy-dependent feedback in AI systems where prediction errors are only revealed at terminal layers, improving stability and performance over standard importance-weighted approaches.
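For context, the standard importance-weighted baseline that the summary contrasts against can be sketched as textbook EXP4. This is not the paper's variance-reduced variant; the function names `exp4_choose` and `exp4_update` and the exploration rate `gamma` are illustrative assumptions, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp4_choose(weights, advice, gamma, rng):
    """Sample an action from the expert mixture plus uniform exploration.

    advice[i][k] is expert i's probability for action k.
    """
    n_actions = len(advice[0])
    total = sum(weights)
    probs = [
        (1 - gamma) * sum(w * a[k] for w, a in zip(weights, advice)) / total
        + gamma / n_actions
        for k in range(n_actions)
    ]
    action = rng.choices(range(n_actions), weights=probs)[0]
    return action, probs


def exp4_update(weights, advice, action, reward, probs, gamma):
    """Update expert weights from bandit feedback on the chosen action only."""
    n_actions = len(probs)
    # Importance-weighted reward estimate: unbiased, but high variance
    # when probs[action] is small.
    x_hat = reward / probs[action]
    return [
        w * math.exp(gamma * a[action] * x_hat / n_actions)
        for w, a in zip(weights, advice)
    ]
```

The division by `probs[action]` is the importance weighting: it keeps the estimate unbiased under partial feedback, at the cost of the variance that variance-reduction schemes aim to control.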

AI · Bullish · OpenAI News · Mar 25 · 6/10

Scaling the OpenAI Academy

OpenAI is expanding its Academy initiative into a comprehensive online resource hub designed to improve AI literacy across diverse backgrounds. The platform will provide tools, best practices, and peer insights to help users effectively access and utilize AI technologies.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10

Reservoir Subspace Injection for Online ICA under Top-n Whitening

Researchers developed Reservoir Subspace Injection (RSI) to improve online Independent Component Analysis under nonlinear mixing conditions. The study identifies performance bottlenecks in top-n whitening and proposes a guarded RSI controller that preserves system performance while achieving 1.7 dB improvement over vanilla online ICA methods.