#gradient-stability News & Analysis

2 articles tagged with #gradient-stability. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

DGPO: Distribution Guided Policy Optimization for Fine Grained Credit Assignment

Researchers introduce Distribution Guided Policy Optimization (DGPO), a reinforcement learning framework that improves how large language models learn complex reasoning tasks by assigning credit at the token level rather than the sequence level. DGPO replaces unstable KL divergence penalties with a bounded Hellinger distance and adds an entropy gating mechanism, achieving state-of-the-art performance on challenging math benchmarks such as AIME2024 and AIME2025.
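
To illustrate why a bounded distance can be more stable than a KL penalty, here is a minimal sketch using the standard textbook definitions of KL divergence and Hellinger distance for discrete distributions. It is not DGPO's implementation; the function names and the example distributions are assumptions made for illustration only.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; unbounded, infinite if q is 0 where p > 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def hellinger_distance(p, q):
    """Hellinger distance between discrete distributions; always lies in [0, 1]."""
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(0.5 * s)

# Hypothetical next-token distributions over a three-token vocabulary.
policy = [0.98, 0.01, 0.01]
reference = [0.01, 0.01, 0.98]

# Distributions with disjoint support: the extreme case.
disjoint_a = [1.0, 0.0]
disjoint_b = [0.0, 1.0]

print("KL (policy vs reference):       ", kl_divergence(policy, reference))
print("Hellinger (policy vs reference):", hellinger_distance(policy, reference))
print("KL (disjoint):                  ", kl_divergence(disjoint_a, disjoint_b))
print("Hellinger (disjoint):           ", hellinger_distance(disjoint_a, disjoint_b))
```

In the disjoint case the KL divergence is infinite while the Hellinger distance saturates at 1, so a penalty built on the latter cannot blow up no matter how far the two distributions drift apart.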

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks

Researchers demonstrate that training physical neural networks composed of nonlinear oscillators reveals a fundamental tradeoff: memory capacity, gradient stability, and dynamical expressivity cannot be simultaneously optimized because all three are governed by damping parameters. Empirical validation on a twenty-oscillator network confirms theoretical predictions, showing trained substrates outperform frozen ones only within a narrow optimal band that contracts as memory horizons increase.
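
As a rough intuition for the memory side of this tradeoff, consider a toy example that is not the paper's model: a single linear underdamped oscillator x'' + γx' + ω²x = 0, whose amplitude envelope decays as exp(-γt/2). The sketch below assumes that "memory" means the fraction of an input's amplitude still present after a given horizon; the function names and the 50% threshold are hypothetical.

```python
import math

def envelope(gamma, t):
    """Amplitude envelope exp(-gamma * t / 2) of an underdamped linear oscillator."""
    return math.exp(-0.5 * gamma * t)

def max_damping_for_horizon(horizon, retained_fraction):
    """Largest damping gamma that still retains `retained_fraction` of the
    amplitude after `horizon` time units (solves exp(-gamma * T / 2) = fraction)."""
    return -2.0 * math.log(retained_fraction) / horizon

# Hypothetical requirement: keep at least half of the amplitude at the horizon.
for horizon in (10, 100, 1000):
    gamma_max = max_damping_for_horizon(horizon, 0.5)
    print(f"horizon={horizon:5d}  max damping={gamma_max:.6f}")
```

In this toy setting the admissible damping shrinks in proportion to 1/horizon, which is consistent in spirit with the summary's point that the usable band contracts as memory horizons grow. The stability and expressivity constraints described in the article are not modeled here.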