AIBullisharXiv – CS AI · Jun 97/10
🧠Researchers introduce MemToolAgent, a framework that enhances LLM agents' ability to use tools effectively by implementing memory management systems that store and retrieve past experiences. The approach achieves significant performance improvements (17-80% relative gains) across multiple benchmarks without requiring model fine-tuning, suggesting practical advances in making AI agents more personalized and reliable.
AIBullisharXiv – CS AI · Jun 27/10
🧠Researchers introduce JAMEL, a framework that trains AI agents to explore open-ended environments more effectively by jointly developing memory systems and exploration policies through novelty-driven learning. The approach uses natural supervisory signals like code coverage to train compressed memory representations, achieving exploration capabilities that rival closed-source models while reducing computational token consumption.
AIBullisharXiv – CS AI · Jun 27/10
🧠Researchers introduce COMAP, a framework that enables language model agents to improve through co-evolution of world models and policies via closed-loop interaction, eliminating the need for external rewards. The approach achieves significant performance gains across multiple benchmarks, demonstrating that self-improving AI agents can adapt their internal representations to match their evolving behavior patterns.
AIBullisharXiv – CS AI · May 117/10
🧠Researchers introduce EvolveR, a framework enabling LLM agents to self-improve through a closed-loop lifecycle combining offline strategy distillation with online task interaction. The system demonstrates superior performance on complex question-answering benchmarks by enabling agents to learn from their own experiences rather than relying solely on external knowledge.
AINeutralarXiv – CS AI · Jun 46/10
🧠Trivium introduces a framework for AI agents that tracks temporal regret—how long errors persist—alongside outcome and epistemic regret to improve long-term learning. The research demonstrates that outcome-only optimization fails to correct systematic causal misunderstandings, and proposes a logarithmic-complexity intervention strategy that achieves O(log E) temporal regret across episode horizons.
AIBullisharXiv – CS AI · May 276/10
🧠Researchers introduce CyberEvolver, an AI agent framework that autonomously improves its own architecture through iterative learning from failed cybersecurity tasks. The system demonstrates 13.6% average success rate improvements across CTF challenges and penetration testing, outperforming fixed human-designed alternatives and competing self-improvement methods.
AINeutralarXiv – CS AI · May 126/10
🧠Researchers present a theoretical framework for inferring the preferences and reward functions of learning agents through observation, extending inverse reinforcement learning beyond its traditional assumption that observed agents act optimally. The work establishes mathematical guarantees for preference learning algorithms when agents are either no-regret learners or converge to optimal Boltzmann policies.
AIBullisharXiv – CS AI · May 116/10
🧠Researchers propose RRCM, a novel framework that enhances Large Language Model-based recommendation systems by dynamically retrieving relevant collaborative and metadata information. The system learns optimal context construction through ranking-driven optimization, addressing key challenges in balancing context quality with efficiency limitations.