MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service
Researchers introduce MARLaaS, a system enabling cost-effective concurrent reinforcement learning fine-tuning for large language models across multiple users through shared base models and asynchronous architecture. The approach achieves 4.3x better accelerator utilization and 85% reduction in training time while maintaining single-task performance quality.