AI · Neutral · arXiv – CS AI · 10h ago · 6/10
CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs
Researchers introduce CalBench, a controlled evaluation framework for testing multi-agent LLM coordination in calendar scheduling scenarios where agents must negotiate shared commitments while protecting private information. The benchmark measures coordination quality, communication efficiency, fairness, and privacy leakage in decentralized systems where no single agent has complete information.
🏢 Meta