Zero-Inflated Gaussian Distributions Enable Parameter-Space Sparsity in Estimation-of-Distribution Algorithms
Researchers introduce zero-inflated Gaussian (ZIG) distributions for estimation-of-distribution algorithms (EDAs) to optimize sparse parameter spaces where most solution coefficients are zero. This approach eliminates the need for hand-crafted sparsity operators and outperforms existing sparse optimization methods on benchmarks.
This research addresses a fundamental limitation in black-box optimization for sparse problems. Estimation-of-distribution algorithms have long excelled at continuous optimization by learning probability distributions from high-performing solutions, avoiding the bias inherent in manually designed genetic operators. However, EDAs have struggled with sparse optimization, that is, problems where optimal solutions contain mostly zeros, which are common in feature selection, neural network pruning, and signal processing. Existing sparse optimizers resort to the very hand-crafted operators that EDAs were designed to overcome, using thresholds, bi-level schemes, and domain-specific heuristics.
The zero-inflated Gaussian approach addresses this by modeling sparsity patterns and active parameter values jointly through a latent Gaussian framework. This unified representation captures interactions between which parameters are active and their numerical values, enabling automatic discovery of sparsity structure without explicit operator design. The mathematical contribution includes a proof of parameter identifiability (a non-trivial result, since similar constructions in missing-data problems typically lack this property) and amortized estimators for practical implementation. Experimental validation on the Lunar Lander control benchmark shows faster convergence, higher final performance, and the discovery of effective sparse controllers.
This work matters for the machine learning and optimization communities because it extends a theoretically principled optimization framework to an important problem class. The methodology could accelerate the development of efficient controllers and models that operate with minimal active parameters, reducing computational costs and improving interpretability in domains from robotics to neural architecture optimization.
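To make the idea of jointly modeling sparsity patterns and active values more concrete, here is a minimal sampling sketch. It is not the authors' construction: it assumes one plausible latent Gaussian layout in which a 2d-dimensional Gaussian holds d candidate values and d "gate" variables, and a coordinate is set to exactly zero whenever its gate is non-positive. The function name `sample_zig`, the gating rule, and all numeric settings are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_zig(mean, cov, n_samples, rng):
    """Draw samples from an illustrative zero-inflated Gaussian.

    ASSUMPTION (not from the source): the latent vector z = (values, gates)
    is jointly Gaussian with 2*d dimensions. Output coordinate i equals
    values[i] when gates[i] > 0 and is exactly 0 otherwise.
    """
    d = mean.shape[0] // 2
    z = rng.multivariate_normal(mean, cov, size=n_samples)
    values, gates = z[:, :d], z[:, d:]
    active = gates > 0.0
    return np.where(active, values, 0.0), active


rng = np.random.default_rng(0)
d = 4

# Hypothetical parameters: value means of 1, gate means that make
# coordinate 0 mostly active and coordinates 1 and 3 mostly zero.
mean = np.concatenate([np.ones(d), np.array([2.0, -2.0, 0.0, -2.0])])
cov = np.eye(2 * d)
# Off-diagonal term coupling the value of coordinate 0 with its own gate,
# so "whether it is active" and "what value it takes" are correlated.
cov[0, d] = cov[d, 0] = 0.5

x, active = sample_zig(mean, cov, 10_000, rng)
sparsity = 1.0 - active.mean(axis=0)  # empirical fraction of zeros per coordinate
print(sparsity)
```

In an EDA loop, a model of this kind would be refit to the best-scoring samples at each generation and then resampled; the refitting step, which is where the paper's identifiability result and amortized estimators come in, is deliberately left out of this sketch.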
- Zero-inflated Gaussian distributions enable EDAs to handle sparse optimization without hand-crafted sparsity operators.
- The latent parameter model is mathematically identifiable and recovers correlation structures in observed samples.
- ZIG-EDA outperforms dense Gaussian EDAs and existing sparse evolutionary algorithms on benchmark tasks.
- The approach optimizes sparsity patterns and active parameter values jointly without hierarchical or bi-level schemes.
- Controllers discovered by ZIG-EDA maintain performance while using only a small fraction of available parameters.