AI · Bullish · arXiv – CS AI · 4d ago · 7/10
🧠Researchers propose a modular state-estimation layer that enhances pre-trained multi-agent reinforcement learning (MARL) policies by compensating for communication delays and packet loss through learned dynamics filtering. The plug-and-play approach combines gated transition models with Kalman filtering to estimate current states from delayed observations, demonstrating significant robustness improvements without requiring retraining of original policies.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers developed HALyPO (Heterogeneous-Agent Lyapunov Policy Optimization), a new approach to improve stability in human-robot collaboration through multi-agent reinforcement learning. The method addresses the 'rationality gap' between human and robot learning by using Lyapunov stability conditions to prevent policy oscillations and divergence during training.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers propose Diamond Attention, a neural architecture using structured randomness to enable role differentiation in multi-agent reinforcement learning systems with identical agents. The method achieves perfect coordination on symmetric games and generalizes zero-shot across different team sizes, demonstrating that protocol-structured randomness—not noise—is essential for solving coordination problems in homogeneous agent systems.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers introduce a family of deterministic games designed to test how well Multi-Agent Reinforcement Learning (MARL) scales for decentralized control of UAV swarms tasked with relaying critical data. While baseline policies using Dijkstra's algorithm perform comparably to standard MARL algorithms for small agent counts, existing MARL approaches show significant scalability limitations as swarm size increases.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers deployed AlphaEvolve, an LLM-powered evolutionary coding framework, to automatically discover new multi-agent reinforcement learning algorithms for imperfect-information games. The system produced two competitive algorithms (VAD-CFR and SHOR-PSRO) that match human-designed baselines, but further analysis revealed that distilled, minimal versions (WOP-CFR and PM-PSRO) generalize better with simpler structures, demonstrating that LLM-discovered complexity often obscures fundamental algorithmic principles.
AI · Neutral · arXiv – CS AI · May 7 · 6/10
🧠Researchers propose CL-MARL, a curriculum learning framework for multi-agent reinforcement learning that dynamically adjusts task difficulty based on agent performance, addressing a fundamental limitation of fixed-difficulty training, which constrains policy generalization. The method achieves a 40% win rate on complex cooperative tasks, outperforming existing baselines.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers propose MA-VLCM, a framework that uses pretrained vision-language models as centralized critics in multi-agent reinforcement learning instead of learning critics from scratch. This approach significantly improves sample efficiency and enables zero-shot generalization while producing compact policies suitable for resource-constrained robots.
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 20
🧠Researchers developed a new multi-agent reinforcement learning algorithm that uses strategic risk aversion to create AI agents that can reliably collaborate with unseen partners. The approach addresses the problem of brittle AI collaboration systems that fail when working with new partners by incorporating robustness against behavioral deviations.
AI · Neutral · arXiv – CS AI · Mar 11 · 4/10
🧠Researchers propose CORA, a new cooperative game-theoretic method for credit assignment in multi-agent reinforcement learning that uses coalition-wise advantage allocation. The approach addresses policy optimization challenges by evaluating marginal contributions of different agent coalitions and demonstrates superior performance across various benchmarks.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠Researchers propose a new multi-agent reinforcement learning framework that addresses communication constraints in real-world scenarios. The approach uses communication-constrained priors to distinguish between lossy and lossless messages, improving learning effectiveness in complex environments with unreliable communication.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 5
🧠Researchers develop a mathematical framework for decentralized control of non-square systems, with applications extending to Multi-Agent Reinforcement Learning (MARL) environments. The work introduces D-stability concepts for non-square matrices and proposes methods to identify stable control pairings for distributed AI architectures.