y0news

#distillation News & Analysis

8 articles tagged with #distillation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 26 · 7/10
🧠

HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation

Researchers introduce Hybrid Distillation Policy Optimization (HDPO), a new method that improves large language model training for mathematical reasoning by addressing 'cliff prompts' where standard reinforcement learning fails. The technique uses privileged self-distillation to provide learning signals for previously unsolvable problems, showing measurable improvements in coverage metrics while maintaining accuracy.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠

Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning

Researchers introduce MARVAL, a distillation framework that accelerates masked auto-regressive diffusion models by compressing inference into a single step while enabling practical reinforcement learning applications. The method achieves 30x speedup on ImageNet with comparable quality, making RL post-training feasible for the first time with these models.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠

DP-OPD: Differentially Private On-Policy Distillation for Language Models

Researchers have developed DP-OPD (Differentially Private On-Policy Distillation), a new framework for training privacy-preserving language models that significantly improves performance over existing methods. The approach simplifies the training pipeline by eliminating the need for DP teacher training and offline synthetic text generation while maintaining strong privacy guarantees.

๐Ÿข Perplexity
AI · Bullish · arXiv – CS AI · Mar 27 · 6/10
🧠

X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs

Researchers propose X-OPD, a Cross-Modal On-Policy Distillation framework to improve Speech Large Language Models by aligning them with text-based counterparts. The method uses token-level feedback from teacher models to bridge performance gaps in end-to-end speech systems while preserving inherent capabilities.
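On-policy distillation of this kind typically minimizes a per-token divergence between the teacher's and student's next-token distributions on sequences sampled from the student. A minimal NumPy sketch of that per-token KL term (array shapes and function names here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def token_level_kd_loss(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student) over one sampled sequence.

    teacher_logits, student_logits: (seq_len, vocab_size) arrays of
    next-token logits scored on the SAME student-generated sequence.
    """
    p = softmax(teacher_logits)                  # teacher distribution per position
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(student_logits) + 1e-12)
    kl_per_token = (p * (log_p - log_q)).sum(axis=-1)   # shape: (seq_len,)
    return kl_per_token.mean()

# Toy check: identical teacher and student logits give zero loss.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))
print(round(token_level_kd_loss(logits, logits), 6))  # → 0.0
```

In a training loop this scalar would be backpropagated through the student logits only; the teacher is frozen.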

AI · Bullish · arXiv – CS AI · Mar 24 · 7/10
🧠

DUET: Distilled LLM Unlearning from an Efficiently Contextualized Teacher

Researchers propose DUET, a new distillation-based method for LLM unlearning that removes undesirable knowledge from AI models without full retraining. The technique combines computational efficiency with security advantages, achieving better performance in both knowledge removal and utility preservation while being significantly more data-efficient than existing methods.

AI · Bullish · arXiv – CS AI · Mar 22 · 7/10
🧠

Embodiment-Aware Generalist Specialist Distillation for Unified Humanoid Whole-Body Control

Researchers introduce EAGLE, a reinforcement learning framework that creates unified control policies for multiple different humanoid robots without per-robot tuning. The system uses iterative generalist-specialist distillation to enable a single AI controller to manage diverse humanoid embodiments and support complex behaviors beyond basic walking.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10
🧠

DistillKac: Few-Step Image Generation via Damped Wave Equations

DistillKac introduces a new fast image generation method using damped wave equations and Kac representation for finite-speed probability transport. Unlike diffusion models with potentially unstable reverse-time velocities, this approach enforces bounded kinetic energy and offers improved numerical stability with fewer function evaluations.

AI · Neutral · Lil'Log (Lilian Weng) · Jan 10 · 5/10
🧠

Large Transformer Model Inference Optimization

Large transformer models face significant inference optimization challenges due to high computational costs and memory requirements. The article discusses technical factors contributing to inference bottlenecks that limit real-world deployment at scale.
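One concrete driver of those memory requirements is the attention KV cache, which grows linearly with batch size and context length. A back-of-the-envelope estimate (the model dimensions below are assumptions for illustration, roughly GPT-3-scale, not figures from the article):

```python
def kv_cache_bytes(batch, seq_len, n_layers, n_heads, head_dim, bytes_per_val=2):
    """Memory for cached keys and values across all layers (fp16 by default).

    Per token per layer we cache one key and one value vector, each of size
    n_heads * head_dim, hence the leading factor of 2.
    """
    return 2 * batch * seq_len * n_layers * n_heads * head_dim * bytes_per_val

# Example: 96 layers, 96 heads of dim 128, batch 8, 2048-token context, fp16.
gib = kv_cache_bytes(8, 2048, 96, 96, 128, 2) / 2**30
print(f"{gib:.1f} GiB")  # → 72.0 GiB
```

At these assumed sizes the cache alone would not fit on a single 80 GB accelerator alongside the weights, which is why techniques like quantization, distillation, and multi-query attention matter for serving.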