y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#agent-architecture News & Analysis

42 articles tagged with #agent-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

42 articles
AIBearisharXiv – CS AI · 3d ago7/10
🧠

Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents

Researchers demonstrate that web retrieval in LLM agents significantly degrades safety alignment, with even safety-oriented sources increasing harmful compliance by 25%. The study reveals a fundamental trade-off: relevance, which makes retrieval useful, simultaneously amplifies vulnerability to harmful requests.

AI × CryptoBearisharXiv – CS AI · 3d ago7/10
🤖

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

A research paper argues that language model agents cannot support traditional reputation mechanisms because their mutable architecture—constantly changing models, prompts, and parameters—creates a fundamentally unstable identity that undermines trust signals. The authors propose shifting from identity-based, retroactive governance systems to protocol-based behavioral controls that operate before agents act.

AIBullisharXiv – CS AI · 4d ago7/10
🧠

LACUNA: Safe Agents as Recursive Program Holes

LACUNA is a new programming model that allows LLM agents to write code that shapes their own runtime environment while maintaining safety through type-checking and validation. The system rejects unsafe code before execution and uses compiler diagnostics to drive retries, achieving competitive performance on benchmark tests while preventing prompt injection and tool misuse attacks.

AIBullisharXiv – CS AI · 4d ago7/10
🧠

MemCog: From Memory-as-Tool to Memory-as-Cognition in Conversational Agents

Researchers introduce MemCog, a new memory system for conversational AI agents that integrates memory access into the reasoning process rather than treating it as a separate tool. The system uses associative link graphs and proactive reasoning to enable agents to autonomously explore relevant information, achieving state-of-the-art performance on multiple benchmarks including a newly created ProactiveMemBench.

AIBearisharXiv – CS AI · 5d ago7/10
🧠

RepoMirage: Probing Repository Context Reasoning in Code Agents with Perturbations

Researchers introduce RepoMirage, an evaluation suite that tests whether code agents truly understand repository context by applying perturbations to challenge their reasoning abilities. The study reveals a significant gap in how agents handle complex, multi-file code tasks, with performance dropping from 66.8% to 25.3% when explicit structural understanding is required.

AIBullisharXiv – CS AI · May 127/10
🧠

MIND-Skill: Quality-Guaranteed Skill Generation via Multi-Agent Induction and Deduction

Researchers introduce MIND-Skill, an automated framework that generates reusable skills for LLM-powered AI agents by analyzing successful task trajectories. The system uses dual agents with quality-control mechanisms to create generalizable, documented procedures that enable autonomous systems to handle complex, multi-step problems without manual human expertise.

AINeutralarXiv – CS AI · May 127/10
🧠

MATRA: Modeling the Attack Surface of Agentic AI Systems -- OpenClaw Case Study

Researchers introduce MATRA, a threat modeling framework designed to systematically assess security risks in autonomous AI agent systems. The framework combines asset-based impact analysis with attack trees to quantify how LLM vulnerabilities translate into real-world deployment risks, demonstrating its effectiveness on an OpenClaw personal agent case study.

AINeutralarXiv – CS AI · May 117/10
🧠

Position: Agent Should Invoke External Tools ONLY When Epistemically Necessary

Researchers propose that AI agents should invoke external tools only when epistemically necessary—when internal reasoning cannot reliably complete a task. The Theory of Agent framework treats tool use as a decision under uncertainty rather than a simple action optimization problem, arguing that unnecessary delegation wastes resources and prevents development of internal reasoning capabilities.

AINeutralarXiv – CS AI · May 117/10
🧠

Self-Programmed Execution for Language-Model Agents

Researchers introduce Self-Programmed Execution (SPE), a novel agent architecture where language models act as their own orchestrators rather than following fixed turn-by-turn policies. The approach uses Spell, a Lisp-based language enabling self-editing programs, and demonstrates that frontier models can perform complex agentic tasks without specialized training.

AIBullisharXiv – CS AI · May 97/10
🧠

Belief Memory: Agent Memory Under Partial Observability

Researchers introduce BeliefMem, a novel memory architecture for LLM agents that retains multiple candidate conclusions with associated probabilities instead of committing to single deterministic interpretations. This probabilistic approach preserves uncertainty, allows agents to update confidence as new evidence arrives, and demonstrates superior performance on LoCoMo and ALFWorld benchmarks compared to existing memory methods.

AIBullisharXiv – CS AI · May 17/10
🧠

Building Persona-Based Agents On Demand: Tailoring Multi-Agent Workflows to User Needs

Researchers propose a pipeline for dynamically generating persona-based AI agents at runtime, moving beyond fixed agent architectures to enable personalized multi-agent workflows. This approach allows agentic platforms to adapt agent roles, coordination patterns, and interaction flows to match individual user characteristics and contextual demands, opening new design paradigms for more flexible AI systems.

AIBearisharXiv – CS AI · May 17/10
🧠

Contextual Agentic Memory is a Memo, Not True Memory

Researchers argue that current AI agent memory systems (vector stores, RAG, scratchpads) perform lookup operations rather than true memory consolidation, causing agents to accumulate indefinite notes without developing expertise, hit a generalization ceiling on novel tasks, and remain vulnerable to persistent memory poisoning attacks. The paper draws on neuroscience's Complementary Learning Systems theory to show biological intelligence pairs fast exemplar storage with slow weight consolidation—a dual mechanism current AI systems lack.

AIBullisharXiv – CS AI · Apr 157/10
🧠

Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents

Researchers introduce dual-trace memory encoding for LLM agents, pairing factual records with narrative scene reconstructions to improve cross-session recall by 20+ percentage points. The method significantly enhances temporal reasoning and multi-session knowledge aggregation without increasing computational costs, advancing the capability of persistent AI agent systems.

AINeutralarXiv – CS AI · Apr 147/10
🧠

PAC-BENCH: Evaluating Multi-Agent Collaboration under Privacy Constraints

Researchers introduce PAC-Bench, a benchmark for evaluating how AI agents collaborate while maintaining privacy constraints. The study reveals that privacy protections significantly degrade multi-agent system performance and identify coordination failures as a critical unsolved challenge requiring new technical approaches.

$PAC
AIBullisharXiv – CS AI · Mar 267/10
🧠

From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents

Researchers have developed Declarative Model Interface (DMI), a new abstraction layer that transforms traditional GUIs into LLM-friendly interfaces for computer-use agents. Testing with Microsoft Office Suite showed 67% improvement in task success rates and 43.5% reduction in interaction steps, with over 61% of tasks completed in a single LLM call.

AIBullisharXiv – CS AI · Mar 167/10
🧠

Towards AI Search Paradigm

Researchers introduce the AI Search Paradigm, a comprehensive framework for next-generation search systems using four LLM-powered agents (Master, Planner, Executor, Writer) that collaborate to handle everything from simple queries to complex reasoning tasks. The system employs modular architecture with dynamic workflows for task planning, tool integration, and content synthesis to create more adaptive and scalable AI search capabilities.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Do Proactive Agents Really Need an LLM to Decide When to Wake and What to Anchor?

Researchers propose replacing LLM-based triggers in proactive agent systems with a lightweight temporal graph learning (TGL) model that processes structured event streams directly. The approach achieves 16.7% mean F1 improvement while running 4-7x faster on GPUs and 12-83x faster on consumer hardware, with a 220 MiB footprint suitable for on-device deployment.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

PEAM: Parametric Embodied Agent Memory through Contrastive Internalization of Experience in Minecraft

Researchers introduce PEAM, a parametric memory framework for AI agents in Minecraft that consolidates learned skills directly into model parameters rather than relying on retrieval-based memory. The system uses a mixture-of-experts architecture with contrastive learning to internalize both successful and failed experiences, achieving better long-horizon task performance while avoiding catastrophic forgetting.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

Rethinking Memory as Continuously Evolving Connectivity

Researchers introduce FluxMem, a memory framework for AI agents that treats memory as a continuously evolving graph rather than a static repository. The system dynamically refines memory connections through feedback and consolidation across three stages, achieving state-of-the-art results on multiple benchmarks.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

Adaptive Multimodal Agents-Based Framework for Automatic Workflow Execution

Researchers propose a novel multimodal multi-agent framework that uses graph-based knowledge construction and adaptive retrieval-augmented generation to enable autonomous agents to execute complex workflows more effectively. The system combines offline discovery of workflow topology from execution logs with real-time collaborative verification, demonstrating improved performance in novel scenarios with limited training data.

AINeutralarXiv – CS AI · 5d ago6/10
🧠

Is Agent Memory a Database? Rethinking Data Foundations for Long-Term AI Agent Memory

Researchers propose Governed Evolving Memory (GEM), a new paradigm for long-term AI agent memory that treats memory as a state-management workload rather than traditional database storage. The framework addresses four critical failure modes in current agent systems—unregulated growth, missing semantic revision, capacity-driven forgetting, and read-only retrieval—through four state-level operators and six correctness conditions that operate at the trajectory level rather than individual records.

AINeutralarXiv – CS AI · May 126/10
🧠

SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents

SkillLens introduces a hierarchical framework for organizing and reusing skills in LLM agents at multiple granularity levels, reducing computational costs while maintaining relevance. The system retrieves and adapts skills selectively rather than injecting entire skill blocks, achieving measurable performance gains on benchmark tasks.

AINeutralarXiv – CS AI · May 126/10
🧠

Why Retrying Fails: Context Contamination in LLM Agent Pipelines

Researchers introduce the Context-Contaminated Restart Model (CCRM) to formally analyze why LLM agents fail at higher rates when retrying tasks after errors, showing that failed attempts pollute the context window and increase subsequent error rates 7.1x. The model provides closed-form formulas for success probability, optimal pipeline depth allocation, and quantifies the exact benefit of clearing context before retry attempts.

Page 1 of 2Next →