Researchers introduce EpiMer, a framework that treats the merging of machine learning models as a geometric optimization problem on Riemannian manifolds. The method uses low-rank task vectors and curvature information to integrate knowledge without retraining, and it is reported to outperform baselines when merging fine-tuned CLIP-ViT models across multiple image classification tasks.
EpiMer addresses a fundamental challenge in machine learning: how to efficiently combine knowledge from multiple specialized models without the computational cost of retraining. The framework's geometric perspective, which casts model merging as finding a Fréchet mean on a Riemannian manifold, gives a theoretical reason why the curvature of the loss landscape matters. By restricting computation to the low-rank subspace spanned by the task vectors, the approach stays computationally tractable while still capturing the essential geometric information.
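For readers unfamiliar with the Fréchet mean, the following worked equations sketch the idea under simplifying assumptions. The symbols are illustrative and not taken from the paper: $w_i$ are merge weights, $G_i$ is a positive-definite local metric (for example a curvature estimate) at the fine-tuned model $\theta_i$, $\theta_0$ is the shared base model, and $U$ is an orthonormal basis for the span of the task vectors $\tau_i = \theta_i - \theta_0$. The squared geodesic distance is replaced by a local quadratic form, which is only an approximation.

```latex
% Fréchet mean of fine-tuned models \theta_1, \dots, \theta_n under a metric g
\theta^\star = \arg\min_{\theta} \sum_{i=1}^{n} w_i \, d_g(\theta, \theta_i)^2

% Local quadratic approximation of the squared distance (assumption)
d_g(\theta, \theta_i)^2 \approx (\theta - \theta_i)^\top G_i \, (\theta - \theta_i)

% Setting the gradient to zero gives a curvature-weighted average
\sum_{i=1}^{n} w_i \, G_i \, (\theta^\star - \theta_i) = 0
\quad\Longrightarrow\quad
\theta^\star = \Big( \sum_{i=1}^{n} w_i G_i \Big)^{-1} \sum_{i=1}^{n} w_i \, G_i \, \theta_i

% Restricting to the task-vector subspace: \theta = \theta_0 + U z, \; U \in \mathbb{R}^{d \times r}
z^\star = \Big( \sum_{i=1}^{n} w_i \, U^\top G_i U \Big)^{-1} \sum_{i=1}^{n} w_i \, U^\top G_i \, \tau_i

% Flat geometry (G_i = I) reduces to plain weighted averaging
G_i = I \;\Longrightarrow\; \theta^\star = \frac{\sum_{i} w_i \, \theta_i}{\sum_{i} w_i}
```

Under these assumptions the restricted problem only requires solving an $r \times r$ linear system rather than working with a full $d \times d$ Hessian, which is one way to read the tractability claim. The last line shows that a flat metric recovers ordinary weighted averaging; the metric that corresponds to spectral merging is not specified in this summary.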
The research builds on growing interest in parameter-efficient model adaptation and multi-task learning. As organizations increasingly fine-tune large foundation models for specific applications, the ability to merge these specialized models efficiently becomes economically valuable. Prior methods either ignored loss landscape geometry entirely or required expensive full-space Hessian computations, leaving a practical gap that EpiMer addresses.
The theoretical contribution links local curvature to epistemic uncertainty, which explains why some parameter regions are more sensitive to merging decisions than others. The framework also unifies curvature-aware and spectral methods under a single mathematical formulation, which shows its generality and gives a cleaner theoretical characterization of when each approach excels.
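The sensitivity point can be illustrated with a small NumPy sketch. This is not EpiMer's algorithm: it is a minimal example that assumes a diagonal curvature proxy per model (for instance a diagonal Fisher estimate), equal merge weights, and a quadratic approximation of the distance. The function name `merge_in_task_subspace` and all parameter names are hypothetical.

```python
import numpy as np


def merge_in_task_subspace(theta_base, thetas, curvatures, rank):
    """Curvature-weighted merge restricted to a low-rank task-vector subspace.

    Minimizes sum_i (theta - theta_i)^T H_i (theta - theta_i) over
    theta = theta_base + U z, where H_i = diag(curvatures[i]) is an assumed
    positive diagonal curvature proxy and U spans the task vectors.
    """
    # Task vectors: each fine-tuned model's offset from the shared base.
    task_vectors = np.stack([t - theta_base for t in thetas], axis=1)  # (d, n)

    # Orthonormal basis for the task-vector span via thin SVD, truncated to `rank`.
    U, _, _ = np.linalg.svd(task_vectors, full_matrices=False)
    U = U[:, :rank]  # (d, rank)

    # Accumulate the normal equations of the restricted quadratic problem.
    A = np.zeros((rank, rank))
    b = np.zeros(rank)
    for tau, h in zip(task_vectors.T, curvatures):
        A += U.T @ (h[:, None] * U)  # U^T H_i U
        b += U.T @ (h * tau)         # U^T H_i tau_i

    z = np.linalg.solve(A, b)  # only a rank x rank system is solved
    return theta_base + U @ z


# Toy example with two "models" in six dimensions (synthetic data).
rng = np.random.default_rng(0)
d = 6
theta_base = rng.normal(size=d)
theta_a = theta_base + rng.normal(size=d)
theta_b = theta_base + rng.normal(size=d)

# Flat geometry: identical curvature for both models.
flat = merge_in_task_subspace(
    theta_base, [theta_a, theta_b], [np.ones(d), np.ones(d)], rank=2
)

# Model A sits in a much sharper region than model B.
sharp_a = merge_in_task_subspace(
    theta_base, [theta_a, theta_b], [100.0 * np.ones(d), np.ones(d)], rank=2
)
```

With identical curvature the result coincides with the plain average of the two models, while a sharper loss landscape around model A pulls the merged parameters toward `theta_a`. In this toy setting, high-curvature directions are the ones where the merge is least free to move.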
Empirical validation on CLIP-ViT models across eight image classification tasks shows consistent improvements over baselines at matched model capacity. The improvements in both average and worst-task accuracy indicate the method handles diverse task distributions effectively. For practitioners deploying multiple specialized models, this work suggests curvature-aware merging could reduce inference infrastructure costs while maintaining or improving performance.
- EpiMer frames model merging as Fréchet mean optimization on Riemannian manifolds using curvature information
- The method restricts computation to low-rank task vector subspaces, making it practical compared to full-space Hessian approaches
- Theoretical analysis proves when curvature-aware merging outperforms flat-geometry methods through decomposed error bounds
- The framework unifies curvature-aware and spectral merging methods as special cases with different geometric metrics
- Empirical results show consistent improvements on CLIP-ViT across eight image classification tasks at matched rank