Subspace-Constrained Federated Learning with Low-Rank Adaptation
Researchers propose a subspace-regularized federated learning approach for low-rank adaptation (LoRA) that addresses geometric misalignment when training large language models across distributed clients with heterogeneous data. The method outperforms existing federated learning baselines on RoBERTa-large and reaches near-perfect basis overlap (0.9999) across multiple models and random seeds.
Federated learning enables collaborative model training while preserving data privacy, but the heterogeneity of client datasets creates a fundamental challenge: local low-rank updates diverge geometrically, leading to destructive aggregation during model synchronization. This research tackles a critical problem in distributed machine learning by introducing subspace regularization that constrains local client updates toward a shared global reference subspace, reducing the misalignment that degrades convergence speed and final model quality.
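To make the idea of constraining local updates toward a shared subspace concrete, here is a minimal sketch of one possible penalty term. It is an illustration under stated assumptions, not the authors' formulation: the summary does not give the exact regularizer, so the projection-residual penalty, the function names, and the weight `lam` are all hypothetical.

```python
import numpy as np

def orthonormal_basis(matrix: np.ndarray) -> np.ndarray:
    """Orthonormal basis for the column space of `matrix` (reduced QR)."""
    q, _ = np.linalg.qr(matrix)
    return q

def subspace_penalty(local_b: np.ndarray, global_basis: np.ndarray) -> float:
    """Squared Frobenius norm of the part of `local_b` that lies outside
    span(global_basis). Assumed form of the regularizer, for illustration."""
    projected = global_basis @ (global_basis.T @ local_b)
    residual = local_b - projected
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
d, r = 16, 4  # hypothetical layer width and LoRA rank

# Shared reference subspace that the server would broadcast to clients.
global_basis = orthonormal_basis(rng.normal(size=(d, r)))

# A client factor that stays inside the reference subspace ...
aligned_b = global_basis @ rng.normal(size=(r, r))
# ... and one that has drifted away from it during local training.
drifted_b = aligned_b + 0.5 * rng.normal(size=(d, r))

penalty_aligned = subspace_penalty(aligned_b, global_basis)
penalty_drifted = subspace_penalty(drifted_b, global_basis)

# A client would add `lam * penalty` to its task loss (lam is hypothetical).
lam = 0.1
print(f"aligned: {lam * penalty_aligned:.2e}, drifted: {lam * penalty_drifted:.2e}")
```

In this sketch the penalty is essentially zero for an update that already lies in the reference subspace and grows as the update drifts out of it, which is the behavior a subspace constraint is meant to encourage.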
The work extends LoRA-based federated learning, which has gained prominence as organizations seek efficient fine-tuning methods for large models under communication constraints. Previous approaches like FedAvg and FedSVD fail to account for the geometric structure of local updates, resulting in suboptimal aggregation. By enforcing alignment through subspace constraints, the authors provide both theoretical motivation and empirical validation across multiple experimental configurations.
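One way to see why naive aggregation can be lossy for LoRA is that averaging the two low-rank factors separately is not the same as averaging the updates they represent. The sketch below illustrates that general point with random matrices. It assumes factor-wise averaging as the aggregation rule and uses made-up dimensions; it is not taken from the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, num_clients = 8, 8, 2, 3  # hypothetical sizes

# Each client holds its own LoRA factors; its update is B_i @ A_i.
bs = [rng.normal(size=(d_out, r)) for _ in range(num_clients)]
as_ = [rng.normal(size=(r, d_in)) for _ in range(num_clients)]

# The update the server would ideally apply: the mean of client updates.
mean_of_products = sum(b @ a for b, a in zip(bs, as_)) / num_clients

# What factor-wise averaging produces: mean(B) @ mean(A).
product_of_means = (sum(bs) / num_clients) @ (sum(as_) / num_clients)

gap = np.linalg.norm(mean_of_products - product_of_means)

# If every client shares the same B, the two quantities coincide,
# because B @ mean(A) equals mean(B @ A) by linearity.
shared_b = bs[0]
aligned_mean = sum(shared_b @ a for a in as_) / num_clients
aligned_product = shared_b @ (sum(as_) / num_clients)
aligned_gap = np.linalg.norm(aligned_mean - aligned_product)

print(f"gap with divergent factors: {gap:.3f}")
print(f"gap with a shared factor:   {aligned_gap:.2e}")
```

The shared-factor case is an extreme form of alignment, but it suggests why keeping client factors close to a common subspace can reduce aggregation error.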
The experimental results reveal nuanced trade-offs in federated learning optimization. On RoBERTa-large, Subspace-Reg substantially outperforms the baselines, with 0.454 mean best accuracy and 0.429 final accuracy. On SmolLM-360M, however, FedAvg performs better, indicating that the benefits of geometric alignment depend on model architecture and scale. The near-perfect basis overlap (0.9999), compared with 0.958-0.991 for the baselines, is strong evidence that the regularizer aligns client subspaces as intended, although the SmolLM-360M result shows that tighter alignment does not always translate into higher accuracy.
For practitioners deploying federated learning systems, this research offers a practical mechanism for improving large-scale model fine-tuning. The code is publicly available, which lowers the barrier to adoption. Because performance varies by model, future work should focus on identifying which architectures benefit most from subspace regularization, which would in turn guide deployment decisions.
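The basis overlap figures can be read as a measure of how closely two subspaces coincide. A minimal sketch of one common definition follows: the squared Frobenius norm of the product of two orthonormal bases, divided by the rank. Whether the paper uses exactly this definition is an assumption, and the function name `basis_overlap` is hypothetical.

```python
import numpy as np

def basis_overlap(a: np.ndarray, b: np.ndarray) -> float:
    """Overlap in [0, 1] between the column spaces of `a` and `b`.

    Assumed definition: ||Qa^T Qb||_F^2 / r, where Qa and Qb are
    orthonormal bases and r is the smaller of the two ranks.
    """
    qa, _ = np.linalg.qr(a)
    qb, _ = np.linalg.qr(b)
    r = min(qa.shape[1], qb.shape[1])
    return float(np.linalg.norm(qa.T @ qb, ord="fro") ** 2 / r)

rng = np.random.default_rng(0)
a = rng.normal(size=(16, 4))

# Same subspace, different basis: mix the columns with an invertible matrix.
mix = np.triu(np.ones((4, 4)))
same_span = a @ mix

# Disjoint coordinate subspaces are fully orthogonal.
eye = np.eye(6)

overlap_same = basis_overlap(a, same_span)
overlap_orthogonal = basis_overlap(eye[:, :2], eye[:, 2:4])
overlap_random = basis_overlap(a, rng.normal(size=(16, 4)))

print(overlap_same, overlap_orthogonal, overlap_random)
```

Under this definition a value of 1 means the two subspaces are identical regardless of how their bases are chosen, and 0 means they are orthogonal, so a reported 0.9999 would indicate that client subspaces are nearly indistinguishable from the reference.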
- Subspace regularization in federated LoRA achieves 0.9999 basis overlap versus 0.958-0.991 for existing methods, validating the geometric alignment hypothesis.
- RoBERTa-large shows significant improvements with Subspace-Reg, but SmolLM-360M favors FedAvg, indicating model-dependent optimization trade-offs.
- The method addresses destructive aggregation caused by heterogeneous client data in federated learning through geometric constraints.
- Comprehensive evaluation across 24 experimental runs (4 methods × 3 seeds × 2 models) provides statistically robust performance validation.
- Public code availability enables rapid integration of subspace-regularized federated learning into production systems.