85 articles tagged with #multi-agent. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠Researchers introduce AutoAgent, a self-evolving multi-agent framework that combines evolving cognition, contextual decision-making, and elastic memory orchestration to enable adaptive autonomous agents. The system continuously learns from experience without external retraining and shows improved performance across retrieval, tool-use, and collaborative tasks compared to static baselines.
AI · Neutral · arXiv – CS AI · Mar 11 · 6/10
🧠Researchers propose a framework using policy-parameterized prompts to influence multi-agent LLM dialogue behavior without training. The approach treats prompts as actions and dynamically constructs them through five components to control conversation flow based on metrics like responsiveness and stance shift.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠Researchers introduce SiliconMind-V1, a new multi-agent AI framework that generates Verilog hardware code with improved functional correctness. The system uses locally fine-tuned language models with integrated testing and debugging capabilities, outperforming existing methods while using fewer training resources.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 9
🧠Researchers introduce EmCoop, a new benchmark framework for studying cooperation among LLM-based embodied multi-agent systems in dynamic environments. The framework separates cognitive coordination from physical interaction layers and provides process-level metrics to analyze collaboration quality beyond just task completion success.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠Researchers propose CollabEval, a new multi-agent framework for evaluating AI-generated content that uses collaborative judgment instead of single LLM evaluation. The system implements a three-phase process with multiple AI agents working together to provide more consistent and less biased evaluations than current approaches.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 5
🧠Researchers introduce LiveCultureBench, a new benchmark that evaluates large language models as autonomous agents in simulated social environments, testing both task completion and adherence to cultural norms. The benchmark uses a multi-cultural town simulation to assess cross-cultural robustness and the balance between effectiveness and cultural sensitivity in LLM agents.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers introduce LiaisonAgent, an autonomous multi-agent cybersecurity system built on the QWQ-32B reasoning model that automates risk investigation and governance for Security Operations Centers. The system achieves a 97.8% success rate in tool-calling and 95% accuracy in risk judgment while reducing manual investigation overhead by 92.7%.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6
🧠MOSAIC is a new open-source platform that enables cross-paradigm comparison and evaluation of different AI agents including reinforcement learning, large language models, vision-language models, and human decision-makers within the same environment. The platform introduces three key technical contributions: an IPC-based worker protocol, operator abstraction for unified interfaces, and a deterministic evaluation framework for reproducible research.
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 15
🧠Researchers developed MACD, a Multi-Agent Clinical Diagnosis framework that enables large language models to self-learn clinical knowledge and improve medical diagnosis accuracy. The system achieved up to 22.3% improvement over clinical guidelines and 16% improvement over physician-only diagnosis when tested on 4,390 real-world patient cases.
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 15
🧠Researchers propose OM2P, a new offline multi-agent reinforcement learning algorithm that achieves efficient one-step action sampling using mean-flow models. The approach delivers up to 3.8x reduction in GPU memory usage and 10.8x speed-up in training time compared to existing diffusion and flow-based models.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed a multi-agent LLM trading framework that decomposes investment analysis into fine-grained tasks rather than coarse-grained instructions. Testing on Japanese stock data showed the approach significantly improved risk-adjusted returns and achieved superior performance through portfolio optimization.
AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 4
🧠Researchers propose QSIM, a new framework that addresses systematic Q-value overestimation in multi-agent reinforcement learning by using action similarity weighted Q-learning instead of traditional greedy approaches. The method demonstrates improved performance and stability across various value decomposition algorithms through similarity-weighted target calculations.
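The contrast between a greedy target and a similarity-weighted target can be illustrated with a small sketch. This is not the QSIM algorithm itself: the softmax weighting, the `similarity` matrix, and the `temperature` parameter are all assumptions made for illustration, since the summary does not give the paper's actual formula.

```python
import math

def greedy_target(reward, gamma, next_q):
    """Standard Q-learning target: bootstrap from the single best next action."""
    return reward + gamma * max(next_q)

def similarity_weighted_target(reward, gamma, next_q, similarity, temperature=1.0):
    """Hypothetical soft target: average next-state Q-values, weighting each
    action by its similarity to the greedy action (softmax over similarities).
    """
    greedy = max(range(len(next_q)), key=next_q.__getitem__)
    logits = [similarity[greedy][a] / temperature for a in range(len(next_q))]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return reward + gamma * sum(w * q for w, q in zip(weights, next_q))

# Illustrative numbers only: three actions, with action 2 the greedy one.
next_q = [1.0, 2.0, 5.0]
similarity = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.6],
    [0.1, 0.6, 1.0],
]
hard = greedy_target(0.0, 0.9, next_q)
soft = similarity_weighted_target(0.0, 0.9, next_q, similarity)
```

Because the weighted target is a convex combination of the next-state Q-values, it can never exceed the greedy target, which is one way such a scheme could dampen overestimation.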
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 3
🧠Researchers developed Hierarchical Co-Self-Play (HCSP), a reinforcement learning framework that enables teams of drones to learn complex 3v3 volleyball through a three-stage training process. The system achieved an 82.9% win rate against baselines and demonstrated emergent team behaviors like role switching and coordinated formations.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers have introduced ESAA (Event Sourcing for Autonomous Agents), a new architecture that improves LLM-based autonomous agents by separating cognitive intention from state mutation using structured JSON events and deterministic orchestration. The system addresses key limitations like context degradation and execution reliability, with successful validation through multi-agent case studies using various LLMs including Claude Sonnet and GPT-5.
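The general event-sourcing pattern described here, where an agent only emits structured JSON events and a deterministic component applies them to state, can be sketched as follows. The event types (`task_created`, `task_completed`), the `EventLog` class, and the handler names are hypothetical and are not taken from the ESAA paper.

```python
import json

# Hypothetical event handlers: the only code allowed to mutate state.
def _task_created(state, event):
    state[event["task_id"]] = "open"

def _task_completed(state, event):
    if state.get(event["task_id"]) != "open":
        raise ValueError(f"task {event['task_id']} is not open")
    state[event["task_id"]] = "done"

HANDLERS = {"task_created": _task_created, "task_completed": _task_completed}

class EventLog:
    """Append-only log of agent intentions expressed as JSON events."""

    def __init__(self):
        self.events = []

    def append(self, raw: str) -> None:
        # The agent emits JSON text; unknown event types are rejected
        # before they can reach the state.
        event = json.loads(raw)
        if event.get("type") not in HANDLERS:
            raise ValueError(f"unknown event type: {event.get('type')!r}")
        self.events.append(event)

    def replay(self) -> dict:
        # Deterministic: the same log always rebuilds the same state.
        state = {}
        for event in self.events:
            HANDLERS[event["type"]](state, event)
        return state

log = EventLog()
log.append('{"type": "task_created", "task_id": "t1"}')
log.append('{"type": "task_completed", "task_id": "t1"}')
state = log.replay()
```

Keeping the log as the source of truth means state can be rebuilt at any time by replaying events, rather than depending on what an agent remembers in its context window.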
AI · Neutral · Ars Technica – AI · Feb 26 · 6/10 · 7
🧠Perplexity has announced 'Computer,' a new AI agent system that can delegate tasks to other AI agents. The system is positioned as a more controlled and safer alternative to the OpenClaw concept.
AI · Neutral · Import AI (Jack Clark) · Feb 9 · 6/10 · 4
🧠Import AI 444 covers recent AI research including Google's findings on LLMs simulating multiple personalities, Huawei's use of AI for kernel development, and the introduction of ChipBench. The newsletter focuses on advancing AI research and development across various applications and hardware optimization.
AI · Bullish · OpenAI News · Sep 17 · 6/10 · 7
🧠Researchers observed AI agents developing increasingly complex strategies through multi-agent interaction in a hide-and-seek game environment. The agents independently discovered six distinct strategies and counterstrategies, some of which were previously unknown to be possible in the environment, suggesting emergent complexity from self-supervised learning.
AI · Bullish · OpenAI News · Jun 25 · 6/10 · 5
🧠OpenAI Five, a team of five neural networks, has achieved the milestone of defeating amateur human teams at the complex video game Dota 2. This represents a significant advancement in AI's ability to handle complex, multi-agent strategic environments.
AI · Bullish · OpenAI News · Sep 14 · 6/10 · 8
🧠OpenAI has released LOLA (Learning with Opponent-Learning Awareness), an algorithm that enables AI agents to model and adapt to other learning agents. The system can develop collaborative strategies like tit-for-tat in game theory scenarios while maintaining self-interest.
AI · Neutral · arXiv – CS AI · Mar 17 · 5/10
🧠Researchers developed a comprehensive benchmarking system to evaluate AI agent performance in single-cell omics analysis, testing 50 real-world tasks across multiple frameworks. The study found that Grok3-beta achieved state-of-the-art performance, while multi-agent frameworks significantly outperformed single-agent approaches through specialized role division.
🧠 Grok
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠Researchers propose a new approach to world models that combines explicit simulators with learned models using the DEVS formalism. The method uses LLMs to generate discrete-event world models from natural language specifications, targeting environments with event-driven dynamics like queueing systems and multi-agent coordination.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠Researchers propose a Retrieval-Augmented Generation (RAG) framework with multi-agent architecture to improve knowledge management and workforce training in state transportation departments. The system combines specialized AI agents for document retrieval, answer generation, and quality control, including vision-language models to process technical figures alongside text.
AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3
🧠Researchers introduce a multi-agent collaboration framework for zero-shot document-level event argument extraction that uses AI agents to generate, evaluate, and refine synthetic training data. The system employs reinforcement learning to iteratively improve both data generation quality and argument extraction performance through a collaborative process.
AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 6
🧠Researchers propose WKGFC, a new AI system that uses knowledge graphs and multi-agent retrieval to improve fact-checking accuracy. The system addresses limitations of current methods that rely on textual similarity by implementing an automated Markov Decision Process with LLM agents to retrieve and verify evidence from multiple sources.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4
🧠Researchers introduce Structured Diversity Control (SDC), a new framework for multi-agent reinforcement learning that improves coordination by controlling behavioral diversity within and between agent groups. The method achieved up to 47.1% improvement in average rewards and 12.82% reduction in episode lengths across various experiments.