AI · Neutral · arXiv – CS AI · 3h ago · 6/10
Multi-Teacher Knowledge Distillation via Teacher-Informed Mixture Priors
Researchers introduce Multi-Teacher Bayesian Knowledge Distillation (MT-BKD), a framework that enables student models to learn from multiple teacher models while quantifying uncertainty through Bayesian inference. The approach uses teacher-informed priors and entropy-based weighting to improve model compression, generalization, and interpretability across synthetic and real-world tasks.
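The summary does not spell out how the entropy-based weighting is computed, so the following is only an illustrative sketch of one common way such a scheme can work, not the paper's actual formulation. It assumes each teacher outputs a class-probability vector, that teachers with lower predictive entropy (more confident) receive larger weight via a softmax over negative entropies, and that the weighted mixture serves as a soft target for the student. The function names `entropy_weights` and `mixture_target` and the `temperature` parameter are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)


def entropy_weights(teacher_probs, temperature=1.0):
    """Assumed weighting rule: softmax over negative teacher entropies.

    Lower-entropy (more confident) teachers get larger weights.
    This is an illustrative choice, not taken from the paper.
    """
    scores = [-entropy(p) / temperature for p in teacher_probs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_target(teacher_probs, weights):
    """Weighted mixture of teacher distributions, used as a soft target."""
    n_classes = len(teacher_probs[0])
    return [
        sum(w * p[k] for w, p in zip(weights, teacher_probs))
        for k in range(n_classes)
    ]


# Two hypothetical teachers predicting over three classes.
teachers = [
    [0.90, 0.05, 0.05],  # confident teacher (low entropy)
    [0.40, 0.30, 0.30],  # uncertain teacher (high entropy)
]
weights = entropy_weights(teachers)
target = mixture_target(teachers, weights)
```

In this toy setup the confident teacher dominates the mixture. A full Bayesian treatment would place a prior informed by this mixture over the student's parameters or predictions rather than using it only as a point target.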