y0news
#training-efficiency1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 6h ago5
๐Ÿง 

OM2P: Offline Multi-Agent Mean-Flow Policy

Researchers propose OM2P, a new offline multi-agent reinforcement learning algorithm that achieves efficient one-step action sampling using mean-flow models. The approach delivers up to 3.8x reduction in GPU memory usage and 10.8x speed-up in training time compared to existing diffusion and flow-based models.