🧠 AI | ⚪ Neutral | Importance 7/10

CRAFT: Grounded Multi-Agent Coordination Under Partial Information

arXiv – CS AI | Abhijnan Nath, Hannah VanderHoeven, Nikhil Krishnaswamy | March 27, 2026 at 04:00 AM

🤖 AI Summary

Researchers introduce CRAFT, a multi-agent benchmark that evaluates how well large language models coordinate through natural language communication under partial information constraints. The study finds that stronger reasoning abilities don't reliably translate to better coordination, with smaller open-weight models often matching or outperforming frontier systems in collaborative tasks.

Key Takeaways

→ CRAFT benchmark tests multi-agent coordination in LLMs where agents must collaborate with incomplete information to build shared 3D structures.
→ Study evaluated 15 models including 8 open-weight and 7 frontier models across spatial grounding, belief modeling, and pragmatic communication.
→ Stronger individual reasoning capabilities do not guarantee better multi-agent coordination performance.
→ Smaller open-weight models frequently matched or exceeded the performance of larger frontier systems in coordination tasks.
→ Multi-agent coordination remains a fundamentally unsolved challenge for current language model architectures.