y0news

#multi-armed-bandits News & Analysis

4 articles tagged with #multi-armed-bandits. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 4

Learning Contextual Runtime Monitors for Safe AI-Based Autonomy

Researchers introduce a novel framework for learning context-aware runtime monitors for AI-based control systems in autonomous vehicles. The approach uses contextual multi-armed bandits to select the best controller for current conditions rather than averaging outputs, providing theoretical safety guarantees and improved performance in simulated driving scenarios.

AI · Bullish · arXiv – CS AI · May 29 · 6/10

Compute Allocation in Evolutionary Search: From Depth-Breadth to Multi-Armed Bandits

Researchers propose BaSE, a multi-armed bandit algorithm that optimizes how large language models allocate computational resources during evolutionary search tasks. By dynamically distributing LLM calls across parallel trajectories, BaSE improves mean fitness by 12.3% over existing baselines while addressing the reliability gap between reported best-case and typical run performance.
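To illustrate the general idea of treating parallel trajectories as bandit arms, here is a minimal sketch that spends a fixed budget of calls using the classic UCB1 rule. It is not BaSE itself, whose details the summary does not give. The function name `ucb1_allocate`, the `reward_fns` callables (stand-ins for "spend one LLM call on this trajectory and observe a fitness gain in [0, 1]"), and the exploration constant `c` are all assumptions made for this example.

```python
import math

def ucb1_allocate(reward_fns, budget, c=2.0):
    """Spend `budget` pulls across arms with UCB1.

    reward_fns: list of zero-argument callables, one per trajectory (hypothetical
                stand-ins for one unit of compute spent on that trajectory).
    Returns (counts, means): pulls per arm and the running mean reward per arm.
    """
    n = len(reward_fns)
    counts = [0] * n
    means = [0.0] * n
    for t in range(1, budget + 1):
        if t <= n:
            # Initialization: try every arm once before using the index.
            arm = t - 1
        else:
            # Pick the arm with the largest optimistic estimate.
            arm = max(
                range(n),
                key=lambda i: means[i] + math.sqrt(c * math.log(t) / counts[i]),
            )
        reward = reward_fns[arm]()
        counts[arm] += 1
        # Incremental update of the running mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means
```

With this rule, trajectories that keep producing higher rewards receive most of the budget, while weaker ones are still revisited occasionally because their confidence bonus grows as they go unsampled.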

AI · Neutral · arXiv – CS AI · May 12 · 5/10

Multi-Armed Bandits With Best-Action Queries

Researchers resolve an open problem in multi-armed bandit theory by characterizing how best-action oracle queries improve learning algorithms in the realistic bandit-feedback model. They prove that the benefits depend critically on reward structure: with correlated stochastic rewards, the theoretical gains seen in full-feedback settings cannot be achieved, while with i.i.d. stochastic rewards, near-optimal improvements are retained with logarithmic precision.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

Near-Optimal Regret for KL-Regularized Multi-Armed Bandits

Researchers developed a new analysis of KL-regularized multi-armed bandits (MABs) using the KL-UCB algorithm, achieving near-optimal regret bounds. The study provides the first high-probability regret bound with linear dependence on the number of arms and establishes matching lower bounds, offering a comprehensive understanding across all regularization regimes.
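For readers unfamiliar with KL-UCB, the sketch below computes the standard KL-UCB index for Bernoulli rewards: the largest mean q that is still statistically plausible given the arm's empirical mean. This is the textbook unregularized index, not the KL-regularized variant analyzed in the paper, whose exact form the summary does not state. The function names, the exploration parameter `c`, and the bisection depth `iters` are assumptions made for this example.

```python
import math

def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, c=0.0, iters=50):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t) + c * log(log(t)).

    mean:  empirical mean reward of the arm, in [0, 1]
    pulls: number of times the arm has been played
    t:     current round (t >= 1)
    """
    if pulls == 0:
        return 1.0  # an unplayed arm gets the most optimistic value
    threshold = (math.log(t) + c * math.log(max(math.log(t), 1.0))) / pulls
    lo, hi = mean, 1.0
    # KL(mean, q) is increasing in q on [mean, 1], so bisection finds the boundary.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round, a KL-UCB learner plays the arm with the largest index. The index shrinks toward the empirical mean as an arm accumulates pulls, which is what drives exploration toward under-sampled arms.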
