#agent-learning News & Analysis

8 articles tagged with #agent-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

Researchers introduce MemToolAgent, a framework that enhances LLM agents' ability to use tools effectively by implementing memory management systems that store and retrieve past experiences. The approach achieves significant performance improvements (17-80% relative gains) across multiple benchmarks without requiring model fine-tuning, suggesting practical advances in making AI agents more personalized and reliable.
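
The retrieve-then-reflect loop described above can be illustrated with a toy memory store. This is only a sketch under stated assumptions, not MemToolAgent's actual implementation: the `MemoryStore` class, the word-overlap (Jaccard) similarity, and the example reflections are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical store of (task, reflection) pairs from past attempts."""
    entries: list = field(default_factory=list)

    def add(self, task, reflection):
        # Record the lesson learned from feedback on a past task.
        self.entries.append((task, reflection))

    def retrieve(self, task, k=2):
        # Rank stored tasks by Jaccard word overlap with the new task
        # (an assumed similarity measure, not the one used in the paper).
        query = set(task.lower().split())

        def score(entry):
            words = set(entry[0].lower().split())
            union = query | words
            return len(query & words) / len(union) if union else 0.0

        ranked = sorted(self.entries, key=score, reverse=True)
        return [e for e in ranked[:k] if score(e) > 0]


store = MemoryStore()
# Made-up reflections, written after feedback such as "invalid time format".
store.add("book a restaurant table at 7pm",
          "Use 24-hour HH:MM time format, e.g. 19:00.")
store.add("check the weather in Paris",
          "Ask for the date before calling the tool.")

hits = store.retrieve("book a restaurant for two", k=1)
```

Here `hits` holds the stored booking reflection, which an agent could prepend to its prompt before calling the tool again.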

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

Joint Agent Memory and Exploration Learning via Novelty Signals

Researchers introduce JAMEL, a framework that trains AI agents to explore open-ended environments more effectively by jointly developing memory systems and exploration policies through novelty-driven learning. The approach uses natural supervisory signals like code coverage to train compressed memory representations, achieving exploration capabilities that rival closed-source models while reducing computational token consumption.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

COMAP: Co-Evolving World Models and Agent Policies for LLM Agents

Researchers introduce COMAP, a framework that enables language model agents to improve through co-evolution of world models and policies via closed-loop interaction, eliminating the need for external rewards. The approach achieves significant performance gains across multiple benchmarks, demonstrating that self-improving AI agents can adapt their internal representations to match their evolving behavior patterns.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

Researchers introduce EvolveR, a framework enabling LLM agents to self-improve through a closed-loop lifecycle combining offline strategy distillation with online task interaction. The system demonstrates superior performance on complex question-answering benchmarks by enabling agents to learn from their own experiences rather than relying solely on external knowledge.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

Trivium introduces a framework for AI agents that tracks temporal regret (how long errors persist) alongside outcome and epistemic regret to improve long-term learning. The research demonstrates that outcome-only optimization fails to correct systematic causal misunderstandings, and proposes a logarithmic-complexity intervention strategy that achieves O(log E) temporal regret across episode horizons.

AI · Bullish · arXiv – CS AI · May 27 · 6/10

CyberEvolver: Structured Self-Evolution for Cybersecurity Agents On the Fly

Researchers introduce CyberEvolver, an AI agent framework that autonomously improves its own architecture through iterative learning from failed cybersecurity tasks. The system demonstrates a 13.6% average improvement in success rate across CTF challenges and penetration testing, outperforming fixed human-designed alternatives and competing self-improvement methods.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Learning the Preferences of a Learning Agent

Researchers present a theoretical framework for inferring the preferences and reward functions of learning agents through observation, extending inverse reinforcement learning beyond its traditional assumption that observed agents act optimally. The work establishes mathematical guarantees for preference learning algorithms when agents are either no-regret learners or converge to optimal Boltzmann policies.
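
For readers unfamiliar with the term, a Boltzmann policy picks each action with probability proportional to the exponential of its estimated value divided by a temperature. The snippet below is a minimal sketch of that standard definition, not code from the paper; the function name `boltzmann_policy` and the example values are illustrative assumptions.

```python
import math


def boltzmann_policy(q_values, temperature=1.0):
    """Return action probabilities proportional to exp(Q / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the max keeps exp() numerically stable
    # and does not change the resulting probabilities.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative action values: higher-valued actions get more probability.
probs = boltzmann_policy([1.0, 2.0, 3.0], temperature=0.5)
```

Lower temperatures concentrate probability on the best action, while higher temperatures move the policy toward uniform random choice.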

AI · Bullish · arXiv – CS AI · May 11 · 6/10

RRCM: Ranking-Driven Retrieval over Collaborative and Meta Memories for LLM Recommendation

Researchers propose RRCM, a framework that enhances Large Language Model-based recommendation systems by dynamically retrieving relevant collaborative and metadata information. The system learns how to construct context through ranking-driven optimization, addressing the trade-off between context quality and efficiency.