SPADE introduces a machine learning framework that adaptively decides whether to enforce physical-structure priors (conservation laws, Hamiltonian forms) based on data evidence, using statistical tests and shrinkage estimation. The method automatically calibrates prior enforcement strength and selects among competing structures, achieving oracle-level performance while reducing computational overhead compared to cross-validation approaches.
SPADE addresses a fundamental challenge in scientific machine learning: when and how strongly to constrain models with domain-specific physical priors. Traditional approaches either rigidly enforce chosen constraints or tune penalties without principled guidance, creating a critical gap between theoretical ideals and practical implementation. This research introduces a statistical framework treating prior enforcement as a shrinkage problem, where a specification test determines whether data supports the constraint before applying it.
The framework builds on established statistical theory, particularly Stein-unbiased shrinkage and hypothesis testing, adapted for scientific computing contexts. By separating the decision to enforce a prior from the strength of enforcement, SPADE enables practitioners to quantify when domain knowledge genuinely improves predictions versus when it introduces harmful bias. The method's ability to handle nested and non-nested constraint families while controlling false discovery rates makes it broadly applicable across scientific domains.
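To make the "decide first, then calibrate strength" idea concrete, here is a minimal sketch of a test-then-shrink rule in a simplified Gaussian setting. It is an illustration under stated assumptions, not SPADE's actual algorithm: the names `shrink_toward_prior` and `chi2_sf_approx` are hypothetical, the noise variance `sigma2` is assumed known, the prior-constrained fit is treated as a fixed target, the strength comes from a classical positive-part James-Stein weight, and the chi-square tail uses the Wilson-Hilferty approximation to stay within the standard library.

```python
import math


def chi2_sf_approx(x, df):
    """Approximate chi-square upper-tail probability (Wilson-Hilferty)."""
    if x <= 0:
        return 1.0
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * df))) / math.sqrt(2.0 / (9.0 * df))
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def shrink_toward_prior(free_fit, prior_fit, sigma2, alpha=0.05):
    """Test whether the prior is compatible with the data, then shrink.

    free_fit  : unconstrained estimates (list of floats)
    prior_fit : estimates implied by the physical prior (assumed fixed here)
    sigma2    : known noise variance of each entry of free_fit (assumption)
    alpha     : level of the specification test (assumption)
    Returns (estimate, weight, p_value); weight is the pull toward the prior.
    """
    p = len(free_fit)
    if p < 3:
        raise ValueError("James-Stein style shrinkage needs at least 3 coordinates")

    resid = [f - c for f, c in zip(free_fit, prior_fit)]
    rss = sum(r * r for r in resid)

    # Step 1: specification test. Under a correct prior, rss / sigma2 is
    # approximately chi-square with p degrees of freedom.
    p_value = chi2_sf_approx(rss / sigma2, p)

    # Step 2: enforcement strength. No enforcement if the test rejects;
    # otherwise a positive-part James-Stein weight in [0, 1].
    if p_value < alpha:
        weight = 0.0
    elif rss == 0.0:
        weight = 1.0
    else:
        weight = min(1.0, (p - 2) * sigma2 / rss)

    estimate = [f - weight * r for f, r in zip(free_fit, resid)]
    return estimate, weight, p_value
```

When the unconstrained fit agrees closely with the prior, the weight approaches 1 and the prior is effectively enforced. When the two disagree strongly, the test rejects and the unconstrained fit is returned unchanged.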
For practitioners in physics-informed machine learning, materials science, and systems modeling, SPADE offers substantial computational and accuracy improvements. Results demonstrate 2.6% error reduction versus 10.3% when using correct priors naively, plus 71× fewer solver calls than cross-validation. This efficiency matters particularly for computationally intensive physical simulations where model evaluation dominates runtime.
The framework's importance extends beyond incremental optimization: it provides principled decision rules for a problem scientists face constantly but currently solve through intuition or extensive hyperparameter search. Future work will likely extend the approach to more complex constraint families and integrate it with neural network architectures used in physics-informed learning.
- SPADE uses statistical specification tests to determine when physical priors help versus harm model predictions, eliminating manual tuning.
- The framework achieves oracle-level performance with O(σ²/n) convergence guarantees using Stein-unbiased shrinkage estimation.
- Computational efficiency improves dramatically, requiring 1/71 of the solver calls needed by cross-validation methods.
- The method selects correct structures with 100% accuracy across linear, conservation law, and nonlinear Hamiltonian priors.
- Benjamini-Hochberg control enables reliable subset discovery when only partial physical laws apply to a system.
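
The Benjamini-Hochberg control mentioned in the last point can be illustrated with the standard step-up procedure. The sketch below is generic: the function name `benjamini_hochberg`, the target rate `q`, and the idea of feeding it one p-value per candidate physical law are assumptions for illustration, and how SPADE maps its test results onto "law applies" versus "law does not apply" is not specified in this summary.

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return the indices of hypotheses rejected by the BH step-up rule.

    p_values : one p-value per candidate hypothesis (e.g. per physical law)
    q        : target false discovery rate (assumed value)
    """
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= q * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff, even if some
    # of them individually missed their own threshold (the step-up part).
    return sorted(order[:cutoff_rank])
```

Because the rule is step-up, a hypothesis whose p-value misses its own threshold can still be rejected if a later-ranked p-value clears its threshold. This is what distinguishes the procedure from applying a fixed per-test cutoff.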