AI · Bullish · arXiv – CS AI · Jun 9 · 7/10
🧠Researchers introduce SkeMex, a self-evolving skill-based memory framework that enables medical AI agents to improve after deployment without retraining model weights. The system distills clinical interaction trajectories into reusable procedural skills, organized across multiple memory branches, and uses environment feedback to determine which experiences are genuinely useful for future decision-making.
AI · Neutral · arXiv – CS AI · May 12 · 7/10
🧠Researchers introduce SkillMaster, a training framework that enables LLM agents to autonomously create, refine, and select skills during task execution rather than relying on external supervision. The system demonstrates 8.8-9.3% performance improvements over existing baselines on complex agent benchmarks, representing a significant step toward self-improving AI agents.
AI · Bullish · arXiv – CS AI · Apr 7 · 7/10
🧠Researchers introduce SkillX, an automated framework for building reusable skill knowledge bases for AI agents that addresses inefficiencies in current self-evolving paradigms. The system uses multi-level skill design, iterative refinement, and exploratory expansion to create plug-and-play skill libraries that improve task success and execution efficiency across different agents and environments.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers introduce PolySkill, a framework that enables AI agents to learn generalizable skills by separating abstract goals from concrete implementations, inspired by software engineering polymorphism. The method improves skill reuse by 1.7x and boosts success rates by up to 13.9% on web navigation tasks while reducing execution steps by over 20%.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce RATs (Robotics Agent Teams), an agentic robot learning system that uses self-directed play to acquire reusable skills before receiving downstream tasks. The approach demonstrates significant performance improvements on robotics benchmarks and enables learned skills to transfer across different agents without finetuning.
AI · Bullish · arXiv – CS AI · Jun 8 · 6/10
🧠Researchers introduce W2S, a framework for automatically constructing high-quality skills for large language model agents by decomposing execution traces into workflow structures, semantics, and attachments. The approach outperforms traditional summarization methods by 10.5%, demonstrating that treating traces as executable specifications rather than text yields more reliable agent behavior.
AI · Bullish · arXiv – CS AI · Jun 4 · 6/10
🧠Researchers introduce State-Grounded Dynamic Retrieval (SGDR), a new method enabling language agents to dynamically reuse learned skills during web automation tasks. By retrieving skills that match both the task goal and the current webpage state, rather than relying on a fixed skill set, SGDR achieves a 10.6% relative performance gain over existing approaches on complex multi-step web tasks.
🧠 GPT-4
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce ReSkill, an RL-in-the-loop framework that improves how AI agents create and refine reusable skills during policy learning. The method synchronizes skill evolution with policy optimization, enabling agents to automatically develop, test, and prune strategies that generalize across tasks more effectively than existing approaches.
🏢 Anthropic
AI · Bullish · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce SIRI, a three-phase reinforcement learning framework that enables LLM agents to autonomously discover, validate, and internalize reusable skills without external skill generators or inference-time skill banks. Testing on ALFWorld and WebShop benchmarks shows meaningful performance improvements over baseline methods while reducing deployment complexity and latency.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce MMG2Skill, a framework that converts unstructured web guides into executable skills for AI agents, with a new benchmark for evaluation. The system improves agent performance by 12.8-25.3 percentage points across multiple domains by structuring knowledge, conditioning vision-language models on refined skills, and iteratively improving them from agent trajectories.
AI · Bullish · arXiv – CS AI · May 28 · 6/10
🧠Researchers propose Skill-Conditioned Gated Self-Distillation (SGSD), a novel method for improving large language model reasoning by leveraging an experience-derived skill bank rather than trusted reference answers. The approach validates skills through a multi-teacher framework and demonstrates consistent improvements over existing methods on mathematical reasoning benchmarks.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠SearchSkill is a new framework that teaches language models to perform more effective web searches by explicitly planning queries through reusable skill cards rather than treating search as an undifferentiated action. The system maintains an evolving skill bank that improves from failure patterns, demonstrating better performance on knowledge-intensive QA tasks with fewer wasted queries and improved reasoning accuracy.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠EmbodiSkill introduces a training-free framework enabling embodied AI agents to autonomously improve their skills through reflection on task execution trajectories. By distinguishing between skill deficiencies and execution lapses, the system allows frozen language models to achieve significantly higher task success rates, with a Qwen 3.5-27B model reaching 93.28% success on ALFWorld benchmarks.
🧠 GPT-5
AI · Neutral · arXiv – CS AI · May 9 · 6/10
🧠Skill1 presents a unified reinforcement learning framework that enables language model agents to co-evolve three coupled capabilities (skill selection, utilization, and distillation) from a single task-outcome reward signal. It improves over existing baselines on complex tasks, suggesting progress in how AI agents build and use persistent skill libraries across diverse problem domains.
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce Skill-SD, a novel training framework for multi-turn LLM agents that improves sample efficiency by converting successful agent trajectories into dynamic natural language skills that condition a teacher model. The approach combines reinforcement learning with self-distillation and achieves significant performance improvements over baseline methods on benchmark tasks.