y0news

#parameter-space News & Analysis

1 article tagged with #parameter-space. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 5h ago · 6/10

On the Geometry of On-Policy Distillation

Researchers characterize the training dynamics of on-policy distillation (OPD), a technique used to improve the reasoning of large language models, and find that it operates in a geometric regime distinct from both supervised fine-tuning and reinforcement learning. The study reports that OPD exhibits 'subspace locking': cumulative updates rapidly converge to a narrow, low-dimensional channel that is functionally sufficient for performance. This suggests that OPD has training dynamics of its own rather than being a simple intermediate between the other two approaches.