AI · Neutral · arXiv – CS AI · 15h ago · 6/10
UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems
UnityMAS-O is a new reinforcement learning optimization framework that enables LLM-based multi-agent systems to be trained end-to-end rather than manually orchestrated. The framework treats entire agent workflows as optimization units and demonstrates performance improvements across QA, search, and code generation tasks, particularly benefiting smaller models.