y0news

#agent-architecture News & Analysis

42 articles tagged with #agent-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

Researchers present MCP-Cosmos, a framework integrating World Models into the Model Context Protocol ecosystem to enhance LLM agent planning and execution. The approach demonstrates measurable improvements in tool success rates and parameter accuracy across multiple benchmark tasks by enabling agents to simulate outcomes before taking actions.

AI · Bullish · arXiv – CS AI · May 11 · 6/10

Group of Skills: Group-Structured Skill Retrieval for Agent Skill Libraries

Researchers introduce Group of Skills (GoSkills), a new method for organizing and retrieving skills in AI agent libraries that presents skills as structured execution contexts rather than flat lists. The approach improves agent performance on benchmark tasks while maintaining efficiency and doesn't require changes to existing agent systems.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Region4Web: Rethinking Observation Space Granularity for Web Agents

Region4Web introduces a framework that changes how AI web agents perceive web pages, shifting the observation granularity from individual elements to functional regions. Validated on the WebArena benchmark, the approach reduces observation length while improving task success rates across multiple LLMs, indicating that hierarchical abstraction of page structure makes agents more efficient.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

An Agent-Oriented Pluggable Experience-RAG Skill for Experience-Driven Retrieval Strategy Orchestration

Researchers present Experience-RAG Skill, an agent-oriented system that dynamically selects optimal retrieval strategies based on task context, rather than using a single fixed pipeline. The system achieves competitive performance across diverse question-answering tasks by leveraging experience memory to orchestrate retrieval, demonstrating that strategy selection can be implemented as a reusable agent component.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents

Researchers present MEMSAD, a defense mechanism against memory poisoning attacks on retrieval-augmented LLM agents, using gradient-coupled anomaly detection to identify adversarial perturbations while maintaining retrieval performance. The work formalizes security vulnerabilities in persistent external memory systems and demonstrates that while composite defenses achieve perfect detection rates, synonym-based attacks remain undetectable by embedding-based approaches.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents

Researchers propose the Experience Compression Spectrum, a unifying framework that reconciles two separate research communities studying LLM agent memory and skill discovery by positioning them along a single compression axis. The framework identifies a critical gap—no existing system supports adaptive cross-level compression—and reveals that memory systems and skill discovery communities operate in isolation despite solving overlapping problems.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Integrating Graphs, Large Language Models, and Agents: Reasoning and Retrieval

A comprehensive survey examines how Large Language Models can be effectively integrated with graph-based data structures to improve reasoning, retrieval, and decision-making across domains. The research categorizes integration approaches by purpose, graph type, and strategy, providing practitioners with guidance on selecting appropriate techniques for specific applications in healthcare, finance, robotics, and other fields.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows

Researchers introduce GTA-2, a hierarchical benchmark that evaluates AI agents on both atomic tool-use tasks and complex, open-ended workflows using real user queries and deployed tools. The study reveals a significant capability cliff where frontier AI models achieve below 50% success on atomic tasks and only 14.39% on realistic workflows, highlighting that execution framework design matters as much as underlying model capacity.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

Self-Monitoring Benefits from Structural Integration: Lessons from Metacognition in Continuous-Time Multi-Timescale Agents

Researchers investigated whether self-monitoring mechanisms (metacognition, self-prediction, duration estimation) improve reinforcement learning agents in predator-prey environments. Initial auxiliary-loss implementations provided no benefits, but structurally integrating these modules into decision pathways showed modest improvements, suggesting effective AI enhancement requires architectural embedding rather than add-on approaches.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space

Researchers demonstrate that large language models develop attractor-like geometric patterns in their activation space when processing identity documents describing persistent agents. Experiments on Llama 3.1 and Gemma 2 show paraphrased identity descriptions cluster significantly tighter than structural controls, suggesting LLMs encode semantic agent identity as stable attractors independent of linguistic variation.

AI · Bullish · arXiv – CS AI · Apr 15 · 6/10

M★: Every Task Deserves Its Own Memory Harness

Researchers introduce M★, a method that automatically evolves task-specific memory systems for large language model agents by treating memory architecture as executable Python code. The approach outperforms fixed memory designs across conversation, planning, and reasoning benchmarks, suggesting that specialized memory mechanisms beat one-size-fits-all solutions.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

From Agent Loops to Structured Graphs: A Scheduler-Theoretic Framework for LLM Agent Execution

Researchers propose SGH (Structured Graph Harness), a framework that replaces iterative Agent Loops with explicit directed acyclic graphs (DAGs) for LLM agent execution. The approach addresses structural weaknesses in current agent design by enforcing immutable execution plans, separating planning from recovery, and implementing strict escalation protocols, trading some flexibility for improved controllability and verifiability.
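
As a rough illustration of what "explicit DAG instead of an agent loop" can mean in practice, the sketch below freezes a plan and runs its steps in dependency order. It is not the SGH implementation: the `run_plan` function, the `plan` and `steps` shapes, and the use of Python's standard `graphlib` are all assumptions made for this example, and recovery and escalation are not modeled.

```python
from graphlib import TopologicalSorter
from types import MappingProxyType


def run_plan(plan, steps):
    """Run steps in dependency order.

    plan  -- hypothetical shape: {node: set of predecessor nodes}
    steps -- hypothetical shape: {node: callable taking {predecessor: result}}
    """
    # Freeze the plan so that it cannot be mutated while it is executing.
    frozen = MappingProxyType({k: frozenset(v) for k, v in plan.items()})
    results = {}
    # static_order() raises graphlib.CycleError if the plan is not a DAG.
    for node in TopologicalSorter(frozen).static_order():
        inputs = {dep: results[dep] for dep in frozen.get(node, ())}
        results[node] = steps[node](inputs)
    return results


plan = {"fetch": set(), "parse": {"fetch"}, "report": {"parse"}}
steps = {
    "fetch": lambda inputs: "raw",
    "parse": lambda inputs: inputs["fetch"].upper(),
    "report": lambda inputs: "done:" + inputs["parse"],
}
print(run_plan(plan, steps)["report"])
```

Because the execution order is fixed before any step runs, the whole plan can be inspected or rejected up front, which is the controllability side of the trade-off the summary describes.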

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

How Much LLM Does a Self-Revising Agent Actually Need?

Researchers introduce a declarative runtime protocol that externalizes agent state to measure how much of an LLM-based agent's competence actually derives from the language model versus explicit structural components. Testing on Collaborative Battleship, they find that explicit world-model planning drives most performance gains, while sparse LLM-based revision at 4.3% of turns yields minimal and sometimes negative returns.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

AI Planning Framework for LLM-Based Web Agents

Researchers introduce a formal planning framework that maps LLM-based web agents to traditional search algorithms, enabling better diagnosis of failures in autonomous web tasks. The study compares different agent architectures using novel evaluation metrics and a dataset of 794 human-labeled trajectories from the WebArena benchmark.

AI · Neutral · arXiv – CS AI · Mar 9 · 6/10

ESAA-Security: An Event-Sourced, Verifiable Architecture for Agent-Assisted Security Audits of AI-Generated Code

Researchers have developed ESAA-Security, a new architecture for conducting secure, verifiable audits of AI-generated code using structured agent workflows rather than unstructured LLM conversations. The system creates an immutable audit trail through event-sourcing and produces comprehensive security reports across 26 tasks and 95 executable checks.
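
To make "immutable audit trail through event-sourcing" concrete, here is a minimal sketch of an append-only event log in which each entry carries a hash of its predecessor, so later edits are detectable. The hash chaining, the `append_event` and `verify` helpers, and the event fields are assumptions for illustration only; the summary does not say how ESAA-Security stores or verifies its events.

```python
import hashlib
import json


def _digest(prev_hash, event):
    # Canonical JSON so that the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append_event(log, event):
    """Return a new log with `event` appended; the input log is not mutated."""
    prev_hash = log[-1]["hash"] if log else ""
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    return log + [entry]


def verify(log):
    """Check that every entry still matches the hash chain."""
    prev_hash = ""
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical audit events; the field names are made up for this example.
log = append_event([], {"check": "sql-injection", "result": "pass"})
log = append_event(log, {"check": "hardcoded-secret", "result": "fail"})
print(verify(log))
```

Editing any recorded event after the fact changes its hash and breaks the chain, so `verify` returns `False` for a tampered log.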

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 12

Graph-Based Self-Healing Tool Routing for Cost-Efficient LLM Agents

Researchers developed Self-Healing Router, a fault-tolerant system for LLM agents that reduces control-plane LLM calls by 93% while maintaining correctness. The system uses graph-based routing with automatic recovery mechanisms, treating agent decisions as routing problems rather than reasoning tasks.
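
One way to picture "routing rather than reasoning" is a shortest-path search over a graph of tools that reroutes around failed nodes without consulting a model. The sketch below is a guess at the general idea, not the paper's algorithm: the graph layout, the edge costs, the `is_healthy` callback, and both function names are hypothetical.

```python
import heapq


def cheapest_path(graph, start, goal, down=frozenset()):
    """Dijkstra over {node: {neighbor: cost}}, avoiding nodes listed in `down`."""
    if start in down:
        return None
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in seen and neighbor not in down:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


def route_with_recovery(graph, start, goal, is_healthy):
    """Reroute around unhealthy nodes; None means no route is left."""
    down = set()
    while True:
        path = cheapest_path(graph, start, goal, frozenset(down))
        if path is None:
            return None  # Nothing left to try: the point to escalate.
        failed = next((node for node in path if not is_healthy(node)), None)
        if failed is None:
            return path
        down.add(failed)  # Mark the failure and search again.


graph = {
    "start": {"api_a": 1, "api_b": 3},
    "api_a": {"done": 1},
    "api_b": {"done": 1},
    "done": {},
}
print(route_with_recovery(graph, "start", "done", lambda node: node != "api_a"))
```

In this reading, a model or a human would only be consulted when `route_with_recovery` returns `None`, which is consistent with the summary's claim of far fewer control-plane LLM calls.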
