Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks
Researchers demonstrate that training physical neural networks composed of nonlinear oscillators reveals a fundamental tradeoff: memory capacity, gradient stability, and dynamical expressivity cannot be simultaneously optimized because all three are governed by damping parameters. Empirical validation on a twenty-oscillator network confirms theoretical predictions, showing trained substrates outperform frozen ones only within a narrow optimal band that contracts as memory horizons increase.
This research challenges a foundational assumption in physical reservoir computing: that substrates should remain frozen during training. By enabling end-to-end learning of mass, damping, and stiffness parameters through symplectic integration, the authors expose a fundamental constraint. Damping simultaneously controls backward gradient decay (which limits how far back credit can propagate), forward Lyapunov sensitivity (which determines stability requirements), and memory capacity. The trilemma emerges because longer memory horizons require less damping, which destabilizes gradients; conversely, stable training demands higher damping, which compresses memory capacity.
Empirically, the twenty-oscillator experiments validate the theoretical framework. Damping sweeps reveal a Lyapunov exponent that varies monotonically with damping and crosses zero at a predicted stability threshold. Performance comparisons across nine memory horizons show learned substrates outperforming frozen ones at short horizons before the advantage reverses near eleven steps, matching the theoretical prediction of band closure. Notably, trained models converge toward the edge of chaos without explicit optimization for this regime, suggesting emergent self-organization in physical systems. The fivefold gap between analytical and empirical gradient thresholds indicates a detectable-versus-learnable distinction worth characterizing rather than eliminating through hyperparameter tuning.
This work bridges theoretical dynamical systems analysis with practical deep learning, establishing when adaptive physical substrates justify their computational overhead. The findings have implications for neuromorphic computing, analog neural network design, and understanding optimization landscapes in systems governed by physical laws rather than arbitrary loss functions.
- Memory, stability, and expressivity form an irreducible tradeoff governed by damping in trainable oscillator networks
- Learned physical substrates outperform frozen ones only within a narrow band that contracts with longer memory requirements
- Trained models spontaneously seek the edge of chaos, suggesting natural attractor behavior in physically constrained systems
- Backward gradient decay and forward Lyapunov sensitivity impose fundamental limits on trainable horizon depths
- Empirical learning thresholds exceed analytical predictions by approximately fivefold, revealing hidden structure in optimization dynamics
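The coupling between damping, memory, and gradient decay described above can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a single linear oscillator with mass `m`, damping `c`, and stiffness `k`, stepped with semi-implicit Euler (a scheme that is symplectic when `c = 0`). All function names and parameter values are hypothetical. In this linear setting the one-step Jacobian has determinant `1 - dt*c/m`, and its spectral radius sets the decay rate of both forward perturbations and backpropagated gradients, since the transposed Jacobian has the same eigenvalues.

```python
import math

def step_matrix(m, c, k, dt):
    """One-step linear map for (x, v) under semi-implicit Euler.

    Velocity is updated first, then position uses the new velocity:
        v' = v - dt * (c * v + k * x) / m
        x' = x + dt * v'
    """
    a = 1.0 - dt * c / m
    return ((1.0 - dt * dt * k / m, dt * a),
            (-dt * k / m, a))

def det2(A):
    """Determinant of a 2x2 matrix; for this map it equals 1 - dt*c/m."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def spectral_radius(A):
    """Largest eigenvalue magnitude of a real 2x2 matrix."""
    tr = A[0][0] + A[1][1]
    det = det2(A)
    disc = tr * tr - 4.0 * det
    if disc < 0.0:
        # Complex-conjugate pair: |lambda|^2 equals the determinant.
        return math.sqrt(det)
    root = math.sqrt(disc)
    return max(abs(tr + root), abs(tr - root)) / 2.0

def horizon(m, c, k, dt, eps=1e-3):
    """Steps until a perturbation (or gradient) shrinks by a factor eps,
    using the asymptotic per-step rate given by the spectral radius."""
    rho = spectral_radius(step_matrix(m, c, k, dt))
    if rho >= 1.0:
        return math.inf  # no contraction: nothing is forgotten
    return math.log(eps) / math.log(rho)

if __name__ == "__main__":
    # Illustrative damping sweep; the values are assumptions, not the paper's.
    for c in (0.05, 0.1, 0.5, 1.0):
        rho = spectral_radius(step_matrix(1.0, c, 1.0, 0.1))
        steps = horizon(1.0, c, 1.0, 0.1)
        print(f"c={c:4.2f}  rho={rho:.5f}  horizon={steps:8.1f} steps")
```

In this toy model the horizon shrinks as damping grows, which is the memory side of the tradeoff. The logarithm of the spectral radius plays the role of a per-step Lyapunov exponent for the linear map. The zero crossing, the edge-of-chaos behavior, and the band closure reported for the twenty-oscillator network depend on nonlinearity and coupling that this sketch does not capture.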