104 articles tagged with #multi-agent-systems. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bearish · arXiv – CS AI · Mar 11 · 7/10
🧠 Research suggests that alignment techniques in large language models may produce collective pathological behaviors when AI agents interact under social pressure. The study found that invisible censorship and complex alignment constraints can lead to harmful group dynamics, challenging current AI safety approaches.
🧠 Llama
AI × Crypto · Neutral · arXiv – CS AI · Mar 6 · 7/10
🤖 Researchers propose S5-SHB Agent, a blockchain framework for smart homes that combines adaptive consensus mechanisms with multi-agent AI coordination. The system uses ten specialized AI agents and a four-tier governance model to manage safety, security, comfort, and energy while allowing resident control over automation.
AI · Neutral · arXiv – CS AI · Mar 5 · 7/10
🧠 Researchers analyzed 770,000 autonomous AI agents interacting in MoltBook, revealing emergent social behaviors including role specialization, information cascades, and limited cooperative task resolution. The study found that while agents naturally develop coordination patterns, collaborative outcomes perform worse than individual agents, establishing baseline metrics for decentralized AI systems.
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10
🧠 Researchers introduced ClawdLab, an open-source platform for autonomous AI scientific research, following analysis of the OpenClaw framework and Moltbook social network that revealed security vulnerabilities across 131 agent skills and over 15,200 exposed control panels. The platform addresses identified failure modes through structured governance and multi-model orchestration in fully decentralized AI systems.
AI · Bearish · arXiv – CS AI · Mar 4 · 7/10
🧠 Research reveals that AI agents experience 'echoing' failures when communicating with each other, where they abandon their assigned roles and mirror their conversation partners instead. The study found echoing rates as high as 70% across major LLM providers, with the phenomenon persisting even in advanced reasoning models and occurring more frequently in longer conversations.
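The echo rate described above can be sketched with a toy detector. This is an illustration only, not the paper's metric: it approximates role abandonment by checking whether a reply shares more vocabulary with the partner's last turn than with the agent's own role prompt, using token-level Jaccard similarity.

```python
import re

def tokens(s: str) -> set[str]:
    """Lowercased word tokens of a string."""
    return set(re.findall(r"\w+", s.lower()))

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    sa, sb = tokens(a), tokens(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def is_echo(reply: str, partner_turn: str, role_prompt: str) -> bool:
    """Flag a turn as echoing when the reply tracks the partner's
    message more closely than the agent's assigned role."""
    return jaccard(reply, partner_turn) > jaccard(reply, role_prompt)

def echo_rate(turns: list[tuple[str, str, str]]) -> float:
    """Fraction of (reply, partner_turn, role_prompt) triples flagged as echoes."""
    return sum(is_echo(*t) for t in turns) / len(turns) if turns else 0.0

turns = [
    # first turn parrots the partner; second stays in the assigned critic role
    ("I also think the answer is 42", "I think the answer is 42", "You are a skeptical critic"),
    ("As a critic, I see a flaw in step 2", "I think the answer is 42", "You are a skeptical critic"),
]
print(echo_rate(turns))  # 0.5: one of two turns is an echo
```

A real detector would use embedding similarity rather than word overlap, but the measurement logic is the same.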
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10
🧠 Researchers introduce RIVA, a multi-agent AI system that uses specialized verification agents and cross-validation to detect infrastructure configuration drift more reliably. The system improves accuracy from 27.3% to 50% when dealing with erroneous tool responses, addressing a critical reliability issue in cloud infrastructure management.
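The cross-validation idea can be sketched in a few lines. Names here are my own, not RIVA's API: several independent checkers read the same configuration value, and the majority reading wins, with any disagreement flagged as suspected drift or a faulty tool response.

```python
from collections import Counter

def cross_validate(readings: list[str]) -> tuple[str, bool]:
    """Return the majority reading and whether any checker disagreed
    with it (a signal of drift or an erroneous tool response)."""
    tally = Counter(readings)
    majority, votes = tally.most_common(1)[0]
    return majority, votes < len(readings)

# Three verification agents read the same config key; one tool call is wrong.
readings = [
    "instance_type=m5.large",
    "instance_type=m5.large",
    "instance_type=t2.micro",
]
value, drift_suspected = cross_validate(readings)
print(value, drift_suspected)  # instance_type=m5.large True
```

Majority voting only helps when tool errors are independent across checkers, which is presumably why the paper uses specialized agents rather than repeated identical queries.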
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10
🧠 Researchers introduce MASPOB, a bandit-based framework that optimizes prompts for Multi-Agent Systems using Graph Neural Networks to handle topology-induced coupling. The system reduces search complexity from exponential to linear while achieving state-of-the-art performance across benchmarks.
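A minimal UCB1 sketch shows the bandit framing behind prompt optimization. MASPOB's actual method additionally couples prompts across agents with a GNN; this toy version treats each candidate prompt as an independent arm with a hidden success probability, which is the part the bandit view makes linear rather than exponential.

```python
import math, random

def ucb1_select(counts: list[int], rewards: list[float], t: int) -> int:
    """Pick the arm maximizing mean reward plus an exploration bonus."""
    for i, n in enumerate(counts):
        if n == 0:
            return i  # try every arm once before exploiting
    return max(
        range(len(counts)),
        key=lambda i: rewards[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i]),
    )

random.seed(0)
true_quality = [0.2, 0.5, 0.8]  # hidden success rate of each candidate prompt
counts = [0, 0, 0]
rewards = [0.0, 0.0, 0.0]
for t in range(1, 501):
    arm = ucb1_select(counts, rewards, t)
    reward = 1.0 if random.random() < true_quality[arm] else 0.0
    counts[arm] += 1
    rewards[arm] += reward

best = max(range(3), key=lambda i: counts[i])
print(counts, best)  # the strongest prompt accumulates most of the pulls
```

In a real system the reward would come from evaluating the multi-agent pipeline with the chosen prompt, which is exactly where per-agent independence breaks down and the GNN coupling comes in.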
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10
🧠 Researchers have developed EvoSkill, an automated framework that enables AI agents to discover and refine domain-specific skills through iterative failure analysis. The system demonstrated significant performance improvements on specialized tasks, with accuracy gains of 7.3% on financial data analysis and 12.1% on search-augmented QA, while showing transferable capabilities across different domains.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed FROGENT, an AI multi-agent system that uses large language models to automate the entire drug discovery pipeline from target identification to synthesis planning. The system outperformed existing AI approaches across eight benchmarks and demonstrated practical applications in real-world drug design scenarios.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers introduce MAS-Orchestra, a new framework for multi-agent AI systems that uses reinforcement learning to orchestrate multiple AI agents more efficiently. The system achieves 10x efficiency improvements over existing methods and includes a benchmark (MASBENCH) to better understand when multi-agent systems outperform single-agent approaches.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have identified and studied the 'Mandela effect' in AI multi-agent systems, where groups of AI agents collectively develop false memories or misremember information. The study introduces MANBENCH, a benchmark to evaluate this phenomenon, and proposes mitigation strategies that achieved a 74.40% reduction in false collective memories.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
🧠 Researchers developed a hierarchical multi-agent LLM framework that significantly improves multi-robot task planning by combining natural language processing with classical PDDL planners. The system uses prompt optimization and meta-learning to achieve success rates of up to 95% on compound tasks, outperforming previous state-of-the-art methods by substantial margins.
$COMP
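The hand-off point between an LLM and a classical planner can be illustrated with a small sketch. The predicate and object names below are invented for the example: the LLM's job reduces to emitting a structured goal, rendered here as a PDDL-style goal block, which a symbolic planner can then solve with soundness guarantees the LLM alone lacks.

```python
def goal_to_pddl(objects: list[str], goal_atoms: list[tuple[str, ...]]) -> str:
    """Render structured goal atoms as PDDL-style (:objects ...) and (:goal ...) blocks."""
    atoms = " ".join(f"({pred} {' '.join(args)})" for pred, *args in goal_atoms)
    objs = " ".join(objects)
    return f"(:objects {objs})\n(:goal (and {atoms}))"

# "Robot1, put box-a on the shelf; Robot2, dock at station-2" as structured atoms
pddl = goal_to_pddl(
    objects=["robot1", "robot2", "box-a", "shelf", "station-2"],
    goal_atoms=[("on", "box-a", "shelf"), ("docked", "robot2", "station-2")],
)
print(pddl)
```

The hard part the paper tackles is upstream of this: reliably extracting those atoms from free-form compound instructions, which is where prompt optimization and meta-learning enter.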
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10
🧠 Researchers propose AgentDropoutV2, a test-time framework that optimizes multi-agent systems by dynamically correcting or removing erroneous outputs without requiring retraining. The system acts as an active firewall with retrieval-augmented rectification, achieving 6.3 percentage point accuracy gains on math benchmarks while preventing error propagation between AI agents.
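The firewall idea can be sketched as a filter on the message bus between agents. The validator and knowledge base below are stand-ins, not the paper's components: each output is validated before it propagates, and an invalid one is either rectified from retrieved material or dropped so the error cannot spread.

```python
def firewall(outputs, validate, knowledge_base):
    """Keep valid outputs; rectify invalid ones from the knowledge base
    when possible, otherwise drop them entirely."""
    kept = []
    for agent, answer in outputs:
        if validate(answer):
            kept.append((agent, answer))
        elif agent in knowledge_base:        # retrieval-augmented rectification
            kept.append((agent, knowledge_base[agent]))
        # else: drop the output so the error cannot propagate downstream
    return kept

outputs = [("solver", "2+2=4"), ("checker", "2+2=5"), ("scribe", "2+2=22")]
kb = {"checker": "2+2=4"}                    # retrieved reference answer
valid = lambda s: s.endswith("=4")           # toy validity check
filtered = firewall(outputs, valid, kb)
print(filtered)  # checker is rectified, scribe is dropped
```

Because this runs purely at test time, it composes with any existing multi-agent system without retraining, which matches the framing in the abstract.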
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠 A comprehensive scoping review of 52 studies examines Large Language Model-based pedagogical agents across educational contexts from November 2022 to January 2025. The research identifies four key design dimensions (interaction approach, domain scope, role complexity, system integration) and emerging trends including multi-agent systems, virtual student simulation, and integration with immersive technologies, while flagging critical research gaps around privacy, accuracy, and student autonomy.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠 Researchers demonstrated that memory length in LLM-based multi-agent systems produces contradictory effects on cooperation depending on the model used: Gemini showed suppressed cooperation with longer memory, while Gemma exhibited enhanced cooperation. The findings suggest model-specific characteristics and alignment mechanisms fundamentally shape emergent social behaviors in AI agent systems.
🧠 Gemini
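The memory-length knob can be illustrated with a toy iterated-game agent; this is my own illustration of the experimental variable, not the paper's models. The agent keeps a sliding window over the partner's past moves and cooperates only if no defection is remembered, so a longer window makes a single early defection poison more future turns.

```python
from collections import deque

def run_agent(history: list[str], memory_len: int) -> list[str]:
    """Replay a partner's move history; cooperate ("C") unless a
    defection ("D") is still inside the agent's memory window."""
    memory = deque(maxlen=memory_len)   # the sliding context window
    moves = []
    for partner_move in history:
        moves.append("C" if "D" not in memory else "D")
        memory.append(partner_move)
    return moves

history = ["C", "D", "C", "C", "C"]     # partner defects once, early on
print(run_agent(history, memory_len=1))  # short memory: quickly forgives
print(run_agent(history, memory_len=4))  # long memory: grudge persists
```

Even in this deterministic toy, the same history yields different cooperation levels under different window sizes, which is the kind of memory-dependent behavioral shift the study measures in LLM agents.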
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers introduce MERMAID, a memory-enhanced multi-agent framework for automated fact-checking that couples evidence retrieval with reasoning processes. The system achieves state-of-the-art performance on multiple benchmarks by reusing retrieved evidence across claims, reducing redundant searches and improving verification efficiency.
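The evidence-reuse mechanism reduces, at its core, to caching retrieval results by query. A minimal sketch, with a stub retriever standing in for real document search, shows why repeated claims stop costing search calls:

```python
search_calls = 0

def retrieve(query: str) -> str:
    """Stub retriever standing in for a real search backend."""
    global search_calls
    search_calls += 1
    return f"evidence for: {query}"

cache: dict[str, str] = {}

def cached_retrieve(query: str) -> str:
    """Serve evidence from memory when possible; search only on a miss."""
    if query not in cache:
        cache[query] = retrieve(query)
    return cache[query]

claims = ["GDP of France 2020", "population of France", "GDP of France 2020"]
evidence = [cached_retrieve(c) for c in claims]
print(search_calls)  # 2 searches for 3 claims; the repeat is a cache hit
```

A production system would match semantically similar rather than identical queries, but the efficiency argument is the same: shared evidence amortizes retrieval cost across claims.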
AI · Bullish · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers developed a multi-agent LLM system that automates structural analysis workflows across multiple finite element analysis (FEA) platforms including ETABS, SAP2000, and OpenSees. Using a two-stage architecture that interprets engineering specifications and translates them into platform-specific code, the system achieved over 90% accuracy in 20 representative frame problems, addressing a critical gap in practical AI-assisted engineering deployment.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Doctoral research proposes a systematic framework for multi-agent LLM pair programming that improves code reliability and auditability through externalized intent and iterative validation. The study addresses critical gaps in how AI coding agents can produce trustworthy outputs aligned with developer objectives across testing, implementation, and maintenance workflows.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 A theoretical research paper examines Promise Theory as a framework for understanding cooperation between human and machine agents in autonomous systems. The work revisits established principles of agent cooperation to address how diverse components (humans, hardware, software, and AI) maintain alignment with intended purposes through signaling, trust, and feedback mechanisms.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers propose MADQRL, a distributed quantum reinforcement learning framework that enables multiple agents to learn independently across high-dimensional environments. The approach demonstrates ~10% improvement over classical distribution strategies and ~5% gains versus traditional policy representation models, addressing computational constraints of current quantum hardware in multi-agent settings.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers demonstrate that artificial agents exhibit prosocial helping behavior when another agent's needs are integrated into their own self-regulatory mechanisms, rather than through explicit social rewards or observation alone. The study uses inspectable recurrent controllers with affect-coupled regulation across two experimental environments, showing that coupling creates a sharp behavioral switch from selfish to helping actions regardless of task complexity.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers examining LLM agent behavior in simulated debates discovered a phenomenon called 'agreement drift,' where AI agents systematically shift toward specific positions on opinion scales in ways that don't mirror human behavior. The study reveals critical biases in using LLMs as proxies for human social systems, particularly when modeling minority groups or unbalanced social contexts.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers propose Dramaturge, a multi-agent LLM system that uses a hierarchical divide-and-conquer methodology to iteratively refine narrative scripts. The approach addresses limitations in single-pass LLM generation by coordinating global structural reviews with scene-level refinements across multiple iterations, demonstrating superior output quality compared to baseline methods.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠 Researchers present PETITE, a tutor-student multi-agent framework that enhances LLM problem-solving by assigning complementary roles to agents from the same model. Evaluated on coding benchmarks, the approach achieves comparable or superior accuracy to existing methods while consuming significantly fewer tokens, demonstrating that structured role-differentiated interactions can improve LLM performance more efficiently than larger models or heterogeneous ensembles.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠 Researchers introduce MATU, a novel uncertainty quantification framework using tensor decomposition to address reliability challenges in Large Language Model-based Multi-Agent Systems. The method analyzes entire reasoning trajectories rather than single outputs, effectively measuring uncertainty across different agent structures and communication topologies.