AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents
Researchers introduce Skill-SD, a training framework for multi-turn LLM agents that improves sample efficiency by converting successful agent trajectories into dynamic natural-language skills, which are then used to condition a teacher model. The approach combines reinforcement learning with self-distillation and reports significant performance improvements over baseline methods on benchmark tasks.