AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Differentiable Belief-based Opponent Shaping
Researchers introduce Differentiable Belief-based Opponent Shaping (D-BOS), a multi-agent reinforcement learning method that shapes opponent behavior by differentiating through opponents' belief states rather than manipulating their parameters or policies directly. In hidden-role games, the approach outperforms existing methods such as PPO and BBM, and it is particularly effective in mixed-motive scenarios.
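To make the idea of "differentiating through a belief state" concrete, here is a toy sketch. It is not the D-BOS algorithm from the paper. It assumes a hypothetical setup: an observer holds a Bayesian belief that an agent is "bad", updates it with fixed, made-up likelihoods after seeing one action, and the agent adjusts a single policy parameter `theta` by following the gradient of the observer's expected posterior. All names and constants (`PRIOR_BAD`, `P_COOP_IF_BAD`, `P_COOP_IF_GOOD`, `shape`) are illustrative assumptions.

```python
import math

# Hypothetical observer model (assumed values, not from the paper).
PRIOR_BAD = 0.5        # observer's prior that the agent has the "bad" role
P_COOP_IF_BAD = 0.2    # observer's assumed P(cooperate | bad)
P_COOP_IF_GOOD = 0.9   # observer's assumed P(cooperate | good)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def posterior_bad(prior_bad, lik_bad, lik_good):
    """Bayes' rule: belief that the agent is 'bad' after one observation."""
    num = prior_bad * lik_bad
    return num / (num + (1.0 - prior_bad) * lik_good)


# Observer's posterior after each possible action (independent of theta).
B_COOP = posterior_bad(PRIOR_BAD, P_COOP_IF_BAD, P_COOP_IF_GOOD)
B_DEFECT = posterior_bad(PRIOR_BAD, 1 - P_COOP_IF_BAD, 1 - P_COOP_IF_GOOD)


def expected_suspicion(theta):
    """Observer's expected posterior P(bad) when the agent cooperates
    with probability sigmoid(theta)."""
    s = sigmoid(theta)
    return s * B_COOP + (1 - s) * B_DEFECT


def suspicion_grad(theta):
    """Analytic derivative of expected_suspicion with respect to theta."""
    s = sigmoid(theta)
    return s * (1 - s) * (B_COOP - B_DEFECT)


def shape(theta, lr=1.0, steps=50):
    """Gradient descent on the observer's expected belief."""
    for _ in range(steps):
        theta -= lr * suspicion_grad(theta)
    return theta


theta_shaped = shape(0.0)
print(expected_suspicion(0.0), expected_suspicion(theta_shaped))
```

The objective here is the observer's belief itself, not the observer's parameters or policy, which is the distinction the summary draws. A practical method would presumably use automatic differentiation and a learned belief model rather than a hand-derived gradient.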