AI · Bullish · arXiv – CS AI · 6h ago · 6/10
LoRDO: Distributed Low-Rank Optimization with Infrequent Communication
Researchers introduce LoRDO, a distributed optimization framework that combines low-rank techniques with infrequent communication to reduce bandwidth requirements in foundation model training by approximately 10x. The method addresses a critical bottleneck in distributed training by enabling workers to perform effective low-rank projections without full-batch gradient access, achieving near-parity performance with standard distributed training at model scales of 125M-720M parameters.
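To give a feel for how low-rank communication and infrequent synchronization compound, here is a rough back-of-envelope sketch. It is not the paper's accounting. The function name `comm_values_per_step`, the matrix shape, the rank, the sync interval, and the assumption that a worker exchanges an `rank x n` projected matrix instead of the full `m x n` one are all illustrative choices that do not come from the summary above.

```python
def comm_values_per_step(m, n, rank=None, sync_every=1):
    """Average number of values one worker sends per optimizer step
    for a single m x n weight matrix (hypothetical cost model).

    rank=None   -> the full m x n matrix is exchanged.
    rank=r      -> assumed: only an r x n projected matrix is exchanged.
    sync_every  -> workers synchronize once every `sync_every` local steps.
    """
    payload = m * n if rank is None else rank * n
    return payload / sync_every


# Illustrative numbers only, not taken from the paper.
dense = comm_values_per_step(4096, 4096)  # full sync every step
compressed = comm_values_per_step(4096, 4096, rank=1024, sync_every=4)
reduction = dense / compressed

print(f"dense: {dense:.0f} values/step")
print(f"low-rank + infrequent: {compressed:.0f} values/step")
print(f"reduction: {reduction:.0f}x")
```

Under this toy model the two savings multiply: a 4x smaller payload sent 4x less often gives a 16x reduction. The actual figure for any real method depends on which optimizer states are exchanged and how often the projection itself is refreshed.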