#kl-divergence News & Analysis

6 articles tagged with #kl-divergence. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Extreme Region Policy Distillation

Researchers propose Extreme Region Policy Distillation (ERPD), a two-stage framework that improves reinforcement learning efficiency for large language models by first extracting maximum training signals through aggressive off-policy optimization, then distilling those signals into a base policy with tighter constraints. The approach achieves comparable or better performance with significantly reduced KL divergence, addressing a fundamental trade-off between sample efficiency and asymptotic performance in LLM training.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Rethinking the Role of Temperature in Large Language Model Distillation

Researchers demonstrate that temperature scaling fundamentally alters the performance comparison between forward KL and reverse KL divergence in LLM distillation, revealing that forward KL substantially outperforms reverse KL at higher temperatures by better leveraging non-dominant token signals. This finding challenges the prevailing preference for reverse KL and suggests that temperature optimization enables simple KL-based methods to match state-of-the-art distillation approaches.
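
To make the forward versus reverse KL comparison concrete, here is a minimal sketch that computes both divergences between a teacher and a student distribution at several temperatures. The logits, the four-token vocabulary, and the helper names `softmax` and `kl` are illustrative assumptions, not values or code from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical logits over a four-token vocabulary (illustrative values only).
teacher_logits = [4.0, 2.0, 1.0, 0.5]
student_logits = [3.0, 2.5, 0.0, 0.0]

for t in (1.0, 2.0, 4.0):
    p = softmax(teacher_logits, t)  # teacher distribution
    q = softmax(student_logits, t)  # student distribution
    # Forward KL weights errors by teacher mass, reverse KL by student mass.
    print(f"T={t}: forward KL(p||q)={kl(p, q):.4f}, reverse KL(q||p)={kl(q, p):.4f}")
```

Raising the temperature flattens both distributions, so non-dominant tokens carry more probability mass in either divergence. Which objective trains a better student is an empirical question this sketch does not settle.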

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Researchers propose a constrained optimization framework for unlearning in diffusion models that balances removing undesirable data against preserving model utility. Using KL divergence and likelihood constraints solved with primal-dual algorithms, the approach outperforms existing weight-based methods in both concept and data unlearning.
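
For readers unfamiliar with primal-dual methods, the toy sketch below shows the general pattern: gradient descent on the primal variable and projected gradient ascent on a Lagrange multiplier. The one-dimensional objective, the constraint, and the function name `primal_dual` are assumptions chosen for illustration. This is not the paper's algorithm, and it involves neither diffusion models nor KL terms.

```python
def primal_dual(steps=5000, lr=0.05):
    """Toy problem: minimize (x - 2)^2 subject to x - 1 <= 0.

    Lagrangian: L(x, lam) = (x - 2)^2 + lam * (x - 1), with lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: gradient descent on the Lagrangian with respect to x.
        grad_x = 2.0 * (x - 2.0) + lam
        x -= lr * grad_x
        # Dual step: gradient ascent on lam, projected back onto lam >= 0.
        lam = max(0.0, lam + lr * (x - 1.0))
    return x, lam


x_star, lam_star = primal_dual()
print(f"x = {x_star:.4f}, lambda = {lam_star:.4f}")
```

The unconstrained minimum sits at x = 2, which violates the constraint, so the multiplier grows until the iterate settles on the boundary x = 1 with multiplier 2.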

AI · Neutral · arXiv – CS AI · May 29 · 6/10

KLAS: Using Similarity to Stitch Neural Networks for Improved Accuracy-Efficiency Tradeoffs

KLAS is a new framework that automates the selection of neural network stitching configurations by using KL divergence to measure similarity between pretrained models, enabling better accuracy-efficiency tradeoffs. The approach improves upon existing heuristic-based methods and achieves up to 1.21% higher accuracy on ImageNet-1K at equivalent computational cost, or reduces computational requirements by 1.33x while maintaining performance.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

An Empirical Audit of k-NAF Budget Accounting for Anchored Decoding

Researchers empirically tested the k-NAF budget accounting mechanism in Anchored Decoding across 8,500 executions and found that cumulative KL divergence spending remained consistently below sequence-level budgets, with no clear evidence of budget exhaustion even under adaptive stress testing. Results suggest the budget mechanism functions reliably, though some proxy artifacts appeared in small-sample evaluations on copyright-domain workloads.
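
The idea of cumulative KL spending against a sequence-level budget can be sketched roughly as follows. The `KLBudget` class, the direction of the per-step divergence, the anchor distribution, and the budget value are all hypothetical. The actual k-NAF accounting in Anchored Decoding may differ.

```python
import math


def kl(p, q):
    """KL(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


class KLBudget:
    """Tracks cumulative KL spending against a fixed sequence-level budget."""

    def __init__(self, budget):
        self.budget = budget
        self.spent = 0.0

    def charge(self, step_kl):
        """Add one step's KL cost; return True while spending stays within budget."""
        self.spent += step_kl
        return self.spent <= self.budget


# Hypothetical anchor (reference) distribution and per-step decoding distributions.
anchor = [0.7, 0.2, 0.1]
steps = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2], [0.7, 0.2, 0.1]]

tracker = KLBudget(budget=0.5)  # assumed budget in nats
within = [tracker.charge(kl(step, anchor)) for step in steps]
print(f"spent={tracker.spent:.4f} of {tracker.budget}, within budget at each step: {within}")
```

An audit of this kind would then check, over many runs, whether `spent` ever exceeds `budget`.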

AI · Neutral · arXiv – CS AI · May 9 · 6/10

CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning

Researchers introduce CRAFT, a continual learning framework for large language models that prevents catastrophic forgetting by learning low-rank interventions on hidden representations rather than updating model weights. The three-stage approach uses KL divergence-based routing and merging to enable models to acquire new capabilities while maintaining performance on previously learned tasks.