#multi-agent News & Analysis

89 articles tagged with #multi-agent. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

Agile Flight Emerges from Multi-Agent Competitive Racing

Researchers demonstrate that multi-agent competitive training enables AI agents to develop agile flight capabilities and strategic behaviors that outperform traditional single-agent training methods. The approach shows superior sim-to-real transfer and generalization when applied to drone racing scenarios with complex environments and obstacles.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling

Researchers propose ALTERNATING-MARL, a new framework for cooperative multi-agent reinforcement learning that enables a global agent to learn with massive populations under communication constraints. The method achieves approximate Nash equilibrium convergence while only observing a subset of local agent states, with applications in multi-robot control and federated optimization.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Researchers introduce Adversarially-Aligned Jacobian Regularization (AAJR), a new method to improve the robustness of autonomous AI agent systems by controlling sensitivity along adversarial directions rather than globally. This approach maintains better performance while ensuring stability in multi-agent AI ecosystems compared to existing methods.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software

Researchers developed a multi-agent LLM system that translates legal statutes into executable software, using U.S. tax preparation as a test case. The system achieved a 45% success rate using GPT-4o-mini, significantly outperforming larger frontier models such as GPT-4o and Claude 3.5, which achieved only 9-15% success rates on complex tax code tasks.

GPT-4 · Claude

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3

MA-CoNav: A Master-Slave Multi-Agent Framework with Hierarchical Collaboration and Dual-Level Reflection for Long-Horizon Embodied VLN

Researchers propose MA-CoNav, a multi-agent collaborative framework for robot navigation that uses a Master-Slave architecture to distribute cognitive tasks among specialized agents. The system outperforms existing Vision-Language Navigation methods by decoupling perception, planning, execution, and memory functions across different AI agents with hierarchical collaboration.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2

AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework

Researchers have developed a Bayesian adversarial multi-agent framework for AI-driven scientific code generation, featuring three coordinated LLM agents that work together to improve reliability and reduce errors. The Low-code Platform (LCP) enables non-expert users to generate scientific code through natural language prompts, demonstrating superior performance in benchmark tests and Earth Science applications.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 6

SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

SuperLocalMemory is a new privacy-preserving memory system for multi-agent AI that defends against memory poisoning attacks through local-first architecture and Bayesian trust scoring. The open-source system eliminates cloud dependencies while providing personalized retrieval through adaptive learning-to-rank, demonstrating strong performance metrics including 10.6ms search latency and 72% trust degradation for sleeper attacks.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2

Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

Researchers have enhanced the Saarthi AI framework for formal verification, achieving 70% better accuracy in generating SystemVerilog assertions and 50% fewer iterations to reach coverage closure. The framework uses multi-agent collaboration and improved RAG techniques to move toward domain-specific AI intelligence for verification tasks.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2

ShareVerse: Multi-Agent Consistent Video Generation for Shared World Modeling

ShareVerse is a new AI video generation framework that enables multiple agents to interact and generate consistent videos within a shared virtual world. The system uses CARLA simulation data and cross-agent attention mechanisms to create 49-frame videos with multi-view consistency across different agents.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation

Researchers introduce BrandFusion, a multi-agent AI framework that enables seamless brand integration into text-to-video generation models. The system addresses commercial monetization challenges in T2V technology by automatically embedding advertiser brands into generated videos while preserving user intent and ensuring natural integration.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language Models

Researchers have developed MedLA, a new logic-driven multi-agent AI framework that uses large language models for complex medical reasoning. The system employs multiple AI agents that organize their reasoning into explicit logical trees and engage in structured discussions to resolve inconsistencies and reach consensus on medical questions.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2

Multimodal Multi-Agent Ransomware Analysis Using AutoGen

Researchers developed a multimodal multi-agent ransomware analysis framework using AutoGen that combines static, dynamic, and network data sources for improved ransomware detection. The system achieved 0.936 Macro-F1 score for family classification and demonstrated stable convergence over 100 epochs with a final composite score of 0.88.
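
For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of per-class F1 scores, so small ransomware families count as much as large ones. The sketch below shows the standard definition only; it is not the paper's evaluation code, and the function name `macro_f1` and the family labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
truth = ["family_a", "family_a", "family_b", "family_b", "family_c"]
preds = ["family_a", "family_b", "family_b", "family_b", "family_c"]
score = macro_f1(truth, preds)
```

Because every class contributes equally, a classifier that ignores a rare family is penalized more under Macro-F1 than under plain accuracy.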

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2

NeuroWise: A Multi-Agent LLM "Glass-Box" System for Practicing Double-Empathy Communication with Autistic Partners

NeuroWise is a multi-agent LLM system designed to help neurotypical individuals better communicate with autistic partners through AI-based coaching and interpretation. A study of 30 participants showed the system significantly reduced deficit-based thinking about autism and improved communication efficiency by 37%.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5

Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment

Researchers introduce Elo-Evolve, a new framework for training AI language models using dynamic multi-agent competition instead of static reward functions. The method achieves 4.5x noise reduction and demonstrates superior performance compared to traditional alignment approaches when tested on Qwen2.5-7B models.
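
As background on the rating system the framework's name refers to, the classic Elo update adjusts two competitors' ratings after each pairwise comparison. This is a minimal sketch of the textbook formula, not the Elo-Evolve training procedure; the function names and the `k=32.0` step size are assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (A, B) ratings; score_a is 1 for a win, 0.5 draw, 0 loss."""
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated models; the first wins one comparison.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

An upset win against a higher-rated opponent moves the ratings further than an expected win, which is what makes the signal adaptive rather than static.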

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Researchers introduce SPIRAL, a self-play reinforcement learning framework that enables language models to develop reasoning capabilities by playing zero-sum games against themselves without human supervision. The system improves performance by up to 10% across 8 reasoning benchmarks on multiple model families including Qwen and Llama.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 5

CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety

Researchers introduce CourtGuard, a new framework for AI safety that uses retrieval-augmented multi-agent debate to evaluate LLM outputs without requiring expensive retraining. The system achieves state-of-the-art performance across 7 safety benchmarks and demonstrates zero-shot adaptability to new policy requirements, offering a more flexible approach to AI governance.

AI · Neutral · Google Research Blog · Jan 28 · 7/10 · 6

Towards a science of scaling agent systems: When and why agent systems work

The article discusses the scientific principles behind scaling agent systems in generative AI, examining the conditions and factors that determine when agent systems perform effectively. It appears to focus on understanding the theoretical foundations for building and deploying AI agent systems at scale.

AI · Bullish · OpenAI News · Oct 23 · 7/10 · 5

Consensus accelerates research with GPT-5 and Responses API

Consensus has deployed GPT-5 and OpenAI's Responses API to create a multi-agent research assistant that can rapidly read, analyze, and synthesize scientific evidence. The platform serves over 8 million researchers and aims to accelerate scientific discovery by automating research processes that previously took much longer.

AI · Bullish · OpenAI News · Mar 16 · 7/10 · 4

Learning to communicate

OpenAI has published new research demonstrating that AI agents can develop their own communication language. This research explores emergent communication capabilities in artificial intelligence systems.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Simulation-Informed Diffusion for Decentralized Multi-robot Motion Planning

Researchers introduce Simulation-Informed Diffusion (SID), a decentralized multi-robot motion planning framework that predicts neighboring robot trajectories to enable collision-free path planning without global communication. The approach scales to 108 robots and 160 obstacles while triggering coordination only when necessary, outperforming existing classical and learning-based planners.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Active teacher selection for reward learning

Researchers introduce the Hidden Utility Bandit (HUB) framework to address a critical limitation in reward learning systems: their reliance on feedback from a single idealized teacher. The framework models teacher heterogeneity in rationality, expertise, and cost, enabling Active Teacher Selection (ATS) algorithms that strategically choose which teachers to query, demonstrating superior performance in paper recommendation and vaccine testing applications.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Skill1 presents a unified reinforcement learning framework that enables language model agents to co-evolve three coupled capabilities: skill selection, utilization, and distillation from a single task-outcome reward signal. Demonstrated improvements over existing baselines on complex tasks suggest advances in how AI agents can build and leverage persistent skill libraries across diverse problem domains.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Implementing surrogate goals for safer bargaining in LLM-based agents

Researchers developed methods to implement 'surrogate goals' in LLM-based agents, reducing bargaining risk by deflecting threats away from what principals care about. The study tested four approaches (including prompting, fine-tuning, and scaffolding) and found that scaffolding and fine-tuning outperformed simple prompting at producing the desired threat-response behaviors.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study Design in Real Social Systems

Researchers introduced InterveneBench, a new benchmark comprising 744 peer-reviewed studies to evaluate large language models' ability to reason about policy interventions and causal inference in social science contexts. Current state-of-the-art LLMs struggle with this type of reasoning, prompting the development of STRIDES, a multi-agent framework that significantly improves performance on these tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

EvolvR: Self-Evolving Pairwise Reasoning for Story Evaluation to Enhance Generation

Researchers have developed EvolvR, a self-evolving framework that improves AI's ability to evaluate and generate stories through pairwise reasoning and multi-agent data filtering. The system achieves state-of-the-art performance on three evaluation benchmarks and significantly enhances story generation quality when used as a reward model.