#agent-design News & Analysis

14 articles tagged with #agent-design. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 25 · 7/10

Skill-MAS: Evolving Meta-Skill for Automatic Multi-Agent Systems

Skill-MAS introduces a novel framework that enhances multi-agent AI systems by evolving meta-skills through a closed optimization loop, achieving significant performance gains while maintaining cost efficiency across diverse LLMs and tasks.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

The Agent Use of Agent Beings: Agent Cybernetics Is the Missing Science of Foundation Agents

Researchers propose Agent Cybernetics, a theoretical framework applying mid-20th century control systems theory to modern LLM-based AI agents. The framework addresses critical gaps in how foundation agents are designed, offering scientific principles for reliability, continuous operation, and safe self-improvement across long-horizon tasks.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

Researchers propose a unified evolutionary framework for LLM agent memory systems, categorizing development into three stages: Storage, Reflection, and Experience. The framework addresses fragmented research by synthesizing engineering and cognitive science perspectives, offering design principles for building more capable autonomous AI agents.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

How Should Agents Read Demonstrations? Hierarchical Structure Beats Flat Action Logs

A research paper demonstrates that organizing demonstration data hierarchically into labeled subgoals significantly improves LLM agent performance on ambiguous tasks, achieving 90.7% pass rates versus 76.7% for flat action logs. This finding provides concrete design guidance for Programming by Demonstration systems and broader procedural knowledge transfer to AI agents.
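To make the contrast in this summary concrete, here is a minimal sketch of the two demonstration formats. The task, the action strings, the subgoal labels, and the names `Subgoal`, `render_flat`, and `render_hierarchical` are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Subgoal:
    """A labeled group of low-level actions (hypothetical structure)."""
    label: str
    actions: list[str] = field(default_factory=list)


# Flat action log: every step sits at the same level, with no stated intent.
flat_log = [
    "click('File')",
    "click('Export')",
    "select('format', 'CSV')",
    "click('Save')",
]

# The same steps grouped under labeled subgoals.
hierarchical_demo = [
    Subgoal("Open the export dialog", ["click('File')", "click('Export')"]),
    Subgoal("Choose the output format", ["select('format', 'CSV')"]),
    Subgoal("Confirm the export", ["click('Save')"]),
]


def render_flat(actions: list[str]) -> str:
    """Render the demonstration as a numbered list of raw actions."""
    return "\n".join(f"{i}. {a}" for i, a in enumerate(actions, start=1))


def render_hierarchical(subgoals: list[Subgoal]) -> str:
    """Render each subgoal label with its actions indented beneath it."""
    lines = []
    for i, sg in enumerate(subgoals, start=1):
        lines.append(f"{i}. {sg.label}")
        for a in sg.actions:
            lines.append(f"   - {a}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_flat(flat_log))
    print()
    print(render_hierarchical(hierarchical_demo))
```

Both renderings carry the same actions; the hierarchical one only adds the intent labels that the summary credits with the improvement.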

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Turning Intent into Specifications: A Benchmark and an Interactive User-Assistant Agent

Researchers introduce SpecBench, a benchmark for evaluating AI agents' ability to translate vague user intent into structured specifications through interactive collaboration. They propose Buddy, an agent that decomposes user requirements into design dimensions, simulates user preferences, and strategically engages users to resolve ambiguities—shifting focus from code generation to specification clarity.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

How can we assess human-agent interactions? Case studies in software agent design

Researchers propose PULSE, a framework for evaluating human-agent interactions in software engineering rather than relying solely on automated benchmarks. The framework combines human feedback with machine learning predictions to assess user satisfaction, revealing significant gaps between benchmark performance and real-world agent effectiveness across 15,000 users.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

A Motivational Architecture for Conversational AGI

Researchers propose a conversational motivational architecture for AGI systems that reinterprets traditional cognitive AI frameworks for dialogue-based agents. Rather than regulating bodily needs, the system manages competence, uncertainty, affiliation, and aesthetic coherence through a ten-stage processing pipeline that separates emotional appraisal from decision-making.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

From Instructor to Collaborator: What a 90-Participant Study Reveals about Human-Agent Collaboration in a Mobile Serious Game

A PhD study with 90 participants compared human-like, spoken embodied conversational agents with text-based agents in a mobile educational game about UK currency. Participants showed a statistically significant preference for the highly human-like agents, with implications for designing collaborative human-agent systems in educational contexts.

AI · Bullish · arXiv – CS AI · May 9 · 6/10

BALAR: A Bayesian Agentic Loop for Active Reasoning

Researchers introduced BALAR, a Bayesian algorithm that enables large language models to engage in structured multi-turn dialogue by actively reasoning about missing information and strategically asking clarifying questions. The system demonstrated significant performance improvements across three diverse benchmarks—14.6% to 38.5% higher accuracy—without requiring fine-tuning, suggesting a more principled approach to interactive AI reasoning.
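As a rough illustration of what "strategically asking clarifying questions" can mean in a Bayesian setting, the sketch below picks the question with the highest expected information gain over a belief about the user's intent. This is a generic textbook technique used here for illustration, not BALAR's actual algorithm; the hypotheses, questions, and function names are all assumptions.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a {hypothesis: probability} dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior(prior, likelihood, answer):
    """Bayes update, where likelihood[h][answer] = P(answer | h)."""
    unnorm = {h: prior[h] * likelihood[h].get(answer, 0.0) for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}


def expected_information_gain(prior, likelihood):
    """Prior entropy minus the expected posterior entropy over answers."""
    answers = {a for h in likelihood for a in likelihood[h]}
    gain = entropy(prior)
    for a in answers:
        p_a = sum(prior[h] * likelihood[h].get(a, 0.0) for h in prior)
        if p_a > 0:
            gain -= p_a * entropy(posterior(prior, likelihood, a))
    return gain


def pick_question(prior, questions):
    """Return the question whose answer is expected to reduce uncertainty most."""
    return max(questions, key=lambda q: expected_information_gain(prior, questions[q]))


# Hypothetical example: four equally likely user intents.
prior = {"flight": 0.25, "train": 0.25, "bus": 0.25, "car": 0.25}
yes, no = {"yes": 1.0}, {"no": 1.0}
questions = {
    # Splits the intents two against two.
    "Do you need a ticket for a fixed departure time?":
        {"flight": yes, "train": yes, "bus": no, "car": no},
    # Splits the intents one against three.
    "Will you be flying?":
        {"flight": yes, "train": no, "bus": no, "car": no},
}

if __name__ == "__main__":
    print(pick_question(prior, questions))
```

With a uniform prior, the question that splits the hypotheses evenly is preferred over the one that isolates a single hypothesis, because it removes more uncertainty on average.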

AI · Neutral · arXiv – CS AI · May 9 · 6/10

More Is Not Always Better: Cross-Component Interference in LLM Agent Scaffolding

Researchers demonstrate that stacking more components into LLM agent systems does not improve performance and often degrades it through cross-component interference. A factorial study across 32 configurations shows that the optimal agent design depends on both the task and the model scale, with the fully equipped system consistently underperforming smaller, curated subsets by up to 79%.

🧠 Llama
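For readers unfamiliar with factorial designs, the sketch below enumerates every on/off combination of a set of scaffolding components. It assumes the 32 configurations come from five binary toggles (2^5 = 32), which the summary does not state; the component names and the `toy_score` function are hypothetical placeholders, not the paper's components or results.

```python
from itertools import product

# Hypothetical component names; five binary toggles give 2**5 = 32 configurations.
COMPONENTS = ["planner", "memory", "reflection", "retrieval", "verifier"]


def all_configurations(components):
    """Yield every on/off combination as a dict (a full factorial design)."""
    for flags in product([False, True], repeat=len(components)):
        yield dict(zip(components, flags))


def toy_score(config):
    """Placeholder metric, NOT real data: each enabled component adds a little,
    but every pair of enabled components pays a small interference penalty."""
    n = sum(config.values())
    pairs = n * (n - 1) // 2
    return 0.50 + 0.10 * n - 0.06 * pairs


if __name__ == "__main__":
    configs = list(all_configurations(COMPONENTS))
    best = max(configs, key=toy_score)
    full = {name: True for name in COMPONENTS}
    print(len(configs), "configurations")
    print("best subset:", [name for name, on in best.items() if on])
    print("full system scores lower than best:", toy_score(full) < toy_score(best))
```

In a real study `toy_score` would be replaced by a benchmark run per configuration; the point of the exhaustive grid is that interactions between components become visible rather than being averaged away.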

AI · Neutral · arXiv – CS AI · May 4 · 6/10

The Silicon Society Cookbook: Design Space of LLM-based Social Simulations

Researchers systematically analyze the design space of LLM-based social simulations, examining how different architectural choices—particularly base model selection and network topology—affect simulated agent behavior and opinion formation. The study reveals non-trivial interactions between parameters and identifies the choice of underlying LLM as the most critical factor determining simulation outcomes.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

The Missing Knowledge Layer in Cognitive Architectures for AI Agents

Researchers identify a critical architectural gap in leading AI agent frameworks (CoALA and JEPA), which lack an explicit Knowledge layer with distinct persistence semantics. The paper proposes a four-layer decomposition model with fundamentally different update mechanics for knowledge, memory, wisdom, and intelligence, with working implementations demonstrating feasibility.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Artifacts as Memory Beyond the Agent Boundary

Researchers formalize how agents can use environmental artifacts as external memory to reduce computational requirements in reinforcement learning tasks. The study demonstrates that spatial observations can implicitly serve as memory substitutes, allowing agents to learn effective policies with less internal memory capacity than previously thought necessary.

AI · Bullish · MarkTechPost · Mar 11 · 6/10

How to Build a Self-Designing Meta-Agent That Automatically Constructs, Instantiates, and Refines Task-Specific AI Agents

This tutorial demonstrates building a Meta-Agent system that automatically designs and instantiates task-specific AI agents from simple descriptions. The system dynamically analyzes tasks, selects appropriate tools, configures memory architecture and planners, then creates fully functional agent runtimes without relying on static templates.