y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#on-policy-training News & Analysis

2 articles tagged with #on-policy-training. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

2 articles
AIBullisharXiv – CS AI · 3h ago6/10
🧠

Data-Efficient On-Policy Distillation for Automatic Speech Recognition

Researchers demonstrate that a 0.6B-parameter ASR model trained on 100k hours of speech can achieve competitive performance with larger models through teacher-guided on-policy distillation, reducing the audio data requirements by 99.5% compared to industry standards while closing the capability gap with 1.7B parameter models.

AINeutralarXiv – CS AI · 3h ago6/10
🧠

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Researchers introduce Vision-OPD, a self-distillation framework that improves multimodal large language models' ability to detect fine-grained visual details by training full-image models to match the performance of crop-focused models. The technique achieves competitive results against larger models without requiring external teachers, labels, or inference-time tools, addressing a critical weakness in current MLLMs.