AI · Bullish · arXiv – CS AI · Feb 27 · 6/106
UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs
Researchers introduce UpSkill, a training method that uses Mutual Information Skill Learning to improve large language models' ability to generate diverse correct responses across multiple attempts. The technique improves pass@k by about 3% on mathematical reasoning tasks with models such as Llama 3.1-8B and Qwen 2.5-7B, without degrading single-attempt accuracy.
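For readers unfamiliar with pass@k: it is the probability that at least one of k sampled responses to a problem is correct. A minimal sketch of the commonly used unbiased estimator follows, assuming n samples are drawn per problem and c of them are correct. The summary does not say which estimator the UpSkill authors used, so this is illustrative only, and the function name `pass_at_k` is hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled responses, c of which are correct.

    Computes 1 - C(n - c, k) / C(n, k): one minus the chance that a random
    size-k subset of the n samples contains no correct response.
    Assumption: this is the standard estimator, not necessarily the one
    used in the UpSkill paper.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the result is exactly 1.0
    # whenever every size-k subset must include a correct response.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Made-up counts for illustration: 10 samples, 3 correct.
    for k in (1, 2, 5):
        print(k, pass_at_k(10, 3, k))
```

With a fixed number of correct samples, pass@k grows with k, which is why a method that diversifies responses can raise pass@k while leaving single-attempt accuracy (pass@1) unchanged.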