AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce POLCA (Prioritized Optimization with Local Contextual Aggregation), a new framework that uses large language models as optimizers for complex systems like AI agents and code generation. The method addresses stochastic optimization challenges through priority queuing and meta-learning, demonstrating superior performance across multiple benchmarks including agent optimization and CUDA kernel generation.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce a new class of asynchronous adaptive first-order optimization methods that improve upon existing algorithms through momentum and inexact normalization variants. The methods achieve O(1/√t) convergence rates in stochastic non-convex settings and demonstrate practical relevance for large-scale heterogeneous machine learning systems.
AI · Bullish · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce S3TS, a novel algorithm combining Monte Carlo Tree Search with stochastic optimization to handle both non-linear complexity and uncertainty in energy grid scheduling. The approach demonstrates near-optimal performance in linear settings and significantly outperforms existing methods in non-linear scenarios, achieving up to 51% cost reductions compared to baseline algorithms.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers establish information-theoretic lower bounds for bit-constrained stochastic optimization, proving that B-bit quantized gradients require communication overhead of TB = Ω(d) and statistical complexity of T = Ω((σ²d/ε²) · max{1, d/B}). The work provides the first rigorous characterization of what is theoretically possible in low-precision pretraining, contrasting with existing empirical studies of FP8 and MXFP4 systems.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose a new decision-focused learning method using score function gradient estimation and stochastic smoothing to train machine learning models that directly optimize for task performance rather than prediction accuracy. The approach removes restrictive assumptions about problem structure, extending applicability to nonlinear objectives, constrained optimization, and two-stage stochastic problems.
AI · Neutral · arXiv – CS AI · May 12 · 5/10
🧠Researchers resolve an open problem in multi-armed bandit theory by characterizing how best-action oracle queries improve learning algorithms in the realistic bandit-feedback model. They prove that benefits depend critically on reward structure: correlated stochastic rewards cannot achieve the theoretical gains seen in full-feedback settings, while i.i.d. stochastic rewards maintain near-optimal improvements with logarithmic precision.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers develop a new information-theoretic framework that handles heavy-tailed data distributions, addressing limitations in classical generalization bounds used in machine learning. The work applies specifically to reinforcement learning from human feedback (RLHF) and stochastic gradient optimization, where traditional KL-divergence tools fail due to non-existent moment generating functions.