AI | Neutral | Importance 4/10
Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization
arXiv (CS AI) | Dipto Sumit, Ankan Kumar Roy, Sadia Khair Rodela, Atia Haque Asha, Mourchona Afrin, Niloy Farhan, Farig Yousuf Sadeque
AI Summary
Researchers developed two techniques, EWAD and CPDP, to improve multi-teacher knowledge distillation for low-resource abstractive summarization. Across Bangla and cross-lingual datasets, the study finds that logit-level knowledge distillation provides the most reliable gains, while more complex distillation improves short summaries but degrades longer outputs.
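For readers unfamiliar with the term, logit-level knowledge distillation usually means training the student to match the teacher's temperature-softened output distribution at each token. The sketch below shows that standard formulation in plain Python; the function names (`softmax`, `kd_loss`) and the temperature value are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the common textbook form of logit-level distillation;
    the paper's exact loss may differ (assumption).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for one token position over a 3-word vocabulary.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
print(kd_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge.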
Key Takeaways
- EWAD routes token-level supervision between the teachers and the gold labels based on inter-teacher agreement.
- CPDP introduces geometric constraints on student positioning relative to heterogeneous teachers.
- Logit-level knowledge distillation provides the most reliable performance gains across experiments.
- Cross-lingual pseudo-label knowledge distillation retains 71-122% of teacher ROUGE-L scores at 3.2x compression.
- Human-validated multi-judge LLM evaluation reveals calibration bias in single-judge evaluation pipelines.
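The agreement-based routing described in the first takeaway can be illustrated with a small sketch. This is not the paper's EWAD formulation, which the summary does not specify. It assumes two teachers, measures agreement as one minus the normalized Jensen-Shannon divergence between their token distributions, and uses a hypothetical threshold of 0.8 to decide whether a token is supervised by the averaged teacher distribution or by the gold label.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def route_target(teacher_a, teacher_b, gold_index, threshold=0.8):
    """Pick the supervision target for one token.

    Agreement is 1 - JS / log(2), so it lies in [0, 1]. Both the
    agreement measure and the threshold are assumptions made for
    illustration only.
    """
    agreement = 1.0 - js_divergence(teacher_a, teacher_b) / math.log(2)
    if agreement >= threshold:
        # Teachers agree: distill from their averaged distribution.
        return "teacher", [(a + b) / 2 for a, b in zip(teacher_a, teacher_b)]
    # Teachers disagree: fall back to the one-hot gold token.
    one_hot = [0.0] * len(teacher_a)
    one_hot[gold_index] = 1.0
    return "gold", one_hot

# Hypothetical token distributions over a 3-word vocabulary.
print(route_target([0.7, 0.2, 0.1], [0.7, 0.2, 0.1], gold_index=0)[0])
print(route_target([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], gold_index=1)[0])
```

A hard threshold keeps the example short; a soft weighting between the two targets would be an equally plausible design.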
#knowledge-distillation #natural-language-processing #abstractive-summarization #multi-teacher #low-resource #cross-lingual #llm-evaluation #bangla #nlp-research