A Temporal Spatial Minimax Rate for Smoothly-Varying Distributions in Wasserstein Space
A new mathematical framework establishes minimax rates for predicting future probability distributions in Wasserstein space from noisy observations of smoothly-varying curves. The research provides both lower bounds and conditional upper bounds for distribution estimation, showing how prediction accuracy degrades with the dimension and with the length of the unobserved future time horizon.
This arXiv paper addresses a fundamental problem in statistical machine learning: predicting how probability distributions evolve over time when only past observations are available. The researchers derive minimax rates that characterize the information-theoretic limits of such predictions, that is, the accuracy no algorithm can beat under the problem's assumptions. The central technical device is the temporal-spatial reduction framework, which embeds transport packings along the time axis to connect temporal forecasting with the complexity of spatial estimation. The work reveals a trade-off between two sources of difficulty: the irreducible cost of predicting unobserved future dynamics, whose size depends on the smoothness assumptions, and the classical statistical curse of dimensionality in distribution estimation.
For practitioners in machine learning and optimal transport theory, this establishes precise performance benchmarks. The dimension-dependent exponent γ_d = min(1/d, 1/2) shows that high-dimensional problems face fundamentally slower estimation rates, a phenomenon relevant to generative modeling and data-driven dynamics.
The authors demonstrate matching upper bounds in restricted settings and provide numerical validation of the theoretical predictions. While the unconditional upper bound for general k remains open, the established rates give designers of forecasting systems theoretical guidance on what accuracy to expect. The framework's design-dependent formulation accommodates arbitrary observation schedules, which extends its practical applicability beyond idealized dense sampling regimes.
- Minimax rates for Wasserstein space distribution prediction scale as M^(-γ_d(k+1)/(k+1+γ_d)), where γ_d captures dimension-dependent difficulty
- Temporal forecasting difficulty decomposes into unobserved-future cost (εh^(k+1)) and spatial estimation curse (M^(-γ_d))
- Lower bounds apply to regular, locally transport-rich distribution classes under adiabatic smoothness constraints
- Matching upper bounds established for k=0 and translation submodels; general-k upper bound remains open
- Theoretical predictions corroborated by numerical experiments on synthetic curved and flat families
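To make the quoted rate concrete, the following minimal sketch evaluates the exponent γ_d(k+1)/(k+1+γ_d) for a few values of d and k, using γ_d = min(1/d, 1/2) as stated in the summary. The function names `gamma_d` and `rate_exponent` are hypothetical helpers introduced here for illustration and do not come from the paper; the summary does not define M beyond its role in the rate, so the code treats it only as the base of the power law.

```python
from fractions import Fraction


def gamma_d(d: int) -> Fraction:
    """Dimension-dependent exponent gamma_d = min(1/d, 1/2), as quoted in the summary."""
    if d < 1:
        raise ValueError("dimension d must be a positive integer")
    return min(Fraction(1, d), Fraction(1, 2))


def rate_exponent(d: int, k: int) -> Fraction:
    """Exponent r in the quoted rate M^(-r), with r = gamma_d (k+1) / (k+1+gamma_d).

    Illustrative helper (not from the paper); k is the index appearing in the
    summary's h^(k+1) term and is assumed to be a non-negative integer.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    g = gamma_d(d)
    return g * (k + 1) / (k + 1 + g)


if __name__ == "__main__":
    # Tabulate the exponent exactly (as fractions) for a small grid of d and k.
    for d in (1, 2, 4, 10):
        for k in (0, 1, 3):
            r = rate_exponent(d, k)
            print(f"d={d:2d}  k={k}  gamma_d={gamma_d(d)}  exponent={r} (~{float(r):.3f})")
```

For example, d = 1 or d = 2 with k = 0 gives an exponent of 1/3, while d = 4 with k = 1 gives 2/9. For fixed d the exponent increases with k but always stays below γ_d, which is consistent with the summary's description of M^(-γ_d) as the spatial estimation component of the rate.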