AINeutralarXiv – CS AI · 9h ago6/10
🧠
TeamBench: Evaluating Agent Coordination under Enforced Role Separation
TeamBench is a new benchmark evaluating multi-agent AI coordination under enforced role separation, revealing that prompt-only instructions fail to prevent role violations and that agent teams often underperform single agents on well-solved tasks. The study demonstrates that passing rates can mask coordination failures and misaligned team dynamics.