AI · Bullish · arXiv - CS AI · 5h ago
OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging
Researchers introduce OptMerge, a benchmark and method for merging multiple expert Multimodal Large Language Models (MLLMs) into a single, more capable model without requiring additional training data. The approach achieves an average performance gain of 2.48% and reduces storage and serving costs by merging models across modalities such as vision, audio, and video.