AI · Neutral · arXiv – CS AI · 18h ago · 6/10
Unsupervised Partner Design Enables Robust Ad-hoc Teamwork
Researchers introduce Unsupervised Partner Design (UPD), a multi-agent reinforcement learning method that generates and adaptively selects training partners without requiring pre-trained populations or manual tuning. The approach performs strongly across multiple benchmarks, and human evaluators rate it higher than existing baselines for adaptability and naturalness.