AIBullish · arXiv · CS AI · 14h ago · 6/10
🧠
Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents
Researchers introduce Skill-SD, a training framework for multi-turn LLM agents that improves sample efficiency by summarizing successful agent trajectories into an evolving set of natural-language skills, which then condition a teacher model. The approach combines reinforcement learning with self-distillation and reports sizable gains over baseline methods on agent benchmarks.
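Since the summary only gestures at the mechanics, here is a minimal Python sketch of one training round under that reading: successful rollouts are summarized into skills, a skill-conditioned teacher produces distillation targets, and the student gets both a distillation and an RL update. Every name and signature below (`run_agent`, `summarize_skill`, `skill_sd_round`, and so on) is a hypothetical stand-in, not the paper's actual code.

```python
# Illustrative sketch of a Skill-SD-style training round, based only on the
# abstract above. All functions are stand-ins for real model/env calls.
import random

def run_agent(policy, task, skills):
    """Roll out a multi-turn episode; returns (trajectory, success)."""
    prompt = task + "\nSkills:\n" + "\n".join(skills)
    trajectory = policy(prompt)          # stand-in for a multi-turn rollout
    success = random.random() < 0.5     # stand-in for the env's reward check
    return trajectory, success

def summarize_skill(trajectory):
    """Stand-in for an LLM call that turns a successful trajectory
    into a short natural-language skill description."""
    return "skill learned from: " + trajectory[:40] + "..."

def skill_sd_round(student, teacher, tasks, skill_bank, distill, rl_update):
    """One round: collect rollouts, grow the skill bank from successes,
    distill from the skill-conditioned teacher, and apply an RL update."""
    for task in tasks:
        traj, ok = run_agent(student, task, skill_bank)
        if ok:
            # Successful trajectories become reusable skills, so the
            # skill bank stays dynamic across training rounds.
            skill_bank.append(summarize_skill(traj))
        # Teacher is conditioned on the current skill bank to produce
        # a distillation target for the same task.
        target = teacher(task + "\nSkills:\n" + "\n".join(skill_bank))
        distill(student, task, target)   # self-distillation step
        rl_update(student, traj, ok)     # RL step on the rollout outcome
    return skill_bank

if __name__ == "__main__":
    echo = lambda prompt: "agent actions for: " + prompt.splitlines()[0]
    noop = lambda *args: None
    bank = skill_sd_round(echo, echo, ["book a flight", "find a hotel"],
                          [], noop, noop)
    print(bank)
```

On this reading, the skill bank grows only from successful rollouts, so the teacher's conditioning context improves across rounds without any weight update to the teacher; whether Skill-SD also trains the teacher is not stated in the summary.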