SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories
SkillAdaptor introduces a training-free framework for refining external skills used by LLM agents, using step-level failure attribution instead of trajectory-level feedback. The method demonstrates consistent improvements across three evaluation benchmarks (WebShop, PinchBench, Claw-Eval) with gains up to 1.8 points, offering more stable and auditable skill maintenance for autonomous agent systems.
SkillAdaptor addresses a critical challenge in autonomous LLM agent development: how to iteratively improve external skills without expensive retraining. Existing approaches update skills based on entire failed trajectories, making it difficult to pinpoint which specific action caused a failure and resulting in overly broad corrections that can degrade overall performance. This new framework narrows the scope by identifying the first actionable fault step within a trajectory, then attributing responsibility to candidate skills and applying targeted updates with explicit acceptance checks.
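The loop described above (find the first actionable fault step, attribute it to a skill, apply a targeted update, keep it only if an acceptance check passes) can be illustrated with a minimal sketch. This is not the authors' implementation. The `Step` record, the `propose_update` and `evaluate` callbacks, and the rule "accept if the score does not drop" are all hypothetical stand-ins, and the sketch assumes each step names a single responsible skill, which simplifies the attribution over candidate skills that the summary mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    index: int
    skill: str  # hypothetical: the skill that guided this action
    ok: bool    # hypothetical: whether a checker judged the step acceptable


def first_fault_step(trajectory: List[Step]) -> Optional[Step]:
    """Return the earliest step judged faulty, or None if every step passes."""
    for step in trajectory:
        if not step.ok:
            return step
    return None


def adapt_skills(
    skills: Dict[str, str],
    trajectory: List[Step],
    propose_update: Callable[[str, Step], str],
    evaluate: Callable[[Dict[str, str]], float],
) -> Dict[str, str]:
    """Apply one targeted update; keep it only if the acceptance check passes."""
    fault = first_fault_step(trajectory)
    if fault is None or fault.skill not in skills:
        return skills  # nothing to attribute, so leave the library untouched

    # Revise only the skill attributed to the fault step.
    candidate = dict(skills)
    candidate[fault.skill] = propose_update(skills[fault.skill], fault)

    # Assumed acceptance rule: the candidate must score at least as well.
    if evaluate(candidate) >= evaluate(skills):
        return candidate
    return skills


# Toy usage: the second step is the first fault, so only "checkout" is revised.
skills = {"search": "Type the query.", "checkout": "Click buy."}
trajectory = [
    Step(0, "search", True),
    Step(1, "checkout", False),
    Step(2, "checkout", False),
]
updated = adapt_skills(
    skills,
    trajectory,
    propose_update=lambda text, step: text + " Verify the selected options first.",
    evaluate=lambda lib: 1.0 if "Verify" in lib["checkout"] else 0.0,  # toy scorer
)
```

In this sketch the `search` skill is never touched, which is the point of narrowing the update to the fault step: a revision based on the whole trajectory could have rewritten both skills. The acceptance check also makes each change auditable, since a rejected candidate leaves the original library in place.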
The advancement reflects broader industry efforts to make LLM agents more reliable and practical for real-world deployment. As agents tackle increasingly complex, long-horizon tasks in domains like web navigation and tool use, the ability to quickly diagnose and fix skill failures becomes essential. Current approaches rely on either no skill adaptation or coarser methods, leaving room for improvement in both stability and transparency.
The experimental validation spans multiple benchmarks and language models (Kimi-K2.5, GLM-5, GPT-5.2) and shows consistent gains, with improvements ranging from 1.5 to 1.8 points depending on the metric. These results suggest that step-level attribution provides more granular feedback than trajectory-level approaches, reducing over-correction and preserving agent robustness. The framework's compatibility with existing agent architectures (OpenClaw-class harnesses) makes it easier to apply in practice.
For the AI development community, this work signals growing maturity in autonomous agent engineering. Future iterations may combine step-level attribution with richer learning mechanisms or extend the framework to handle skill composition and complex failure chains.
- SkillAdaptor uses step-level failure attribution to identify and fix specific skill failures rather than revising skills based on entire failed trajectories
- The framework improves LLM agent performance by 1.5-1.8 points across WebShop, PinchBench, and Claw-Eval benchmarks without requiring model retraining
- Targeted skill updates with explicit acceptance checks maintain backbone model stability while enabling more auditable skill maintenance
- The approach is compatible with existing OpenClaw-class agent architectures, enabling practical integration into current deployment systems
- Step-level attribution reduces overly broad skill revisions that can degrade agent performance, improving both accuracy and reliability