AINeutralarXiv – CS AI · 7h ago6/10
🧠
Information-Theoretic Lower Bounds for Bit-Constrained Stochastic Optimization via a Reduction to Compressed Gaussian Mean Estimation
Researchers establish information-theoretic lower bounds for bit-constrained stochastic optimization, proving that B-bit quantized gradients require communication overhead of TB = Omega(d) and statistical complexity of T = Omega(sigma^2 d / eps^2 * max{1, d/B}). The work provides the first rigorous characterization of what's theoretically possible in low-precision pretraining, contrasting with existing empirical studies of FP8 and MXFP4 systems.