10 articles tagged with #model-merging. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · Apr 6 · 7/10
🧠 Researchers studied weight-space model merging for multilingual machine translation and found that it significantly degrades translation quality when the merged models' target languages differ. Their analysis shows that fine-tuning redistributes rather than sharpens language selectivity in the network, increasing representational divergence in the higher layers that govern text generation.
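For readers new to the technique: the simplest form of weight-space merging is plain parameter averaging of same-architecture checkpoints. A minimal PyTorch sketch, with hypothetical checkpoint names, of generic averaging rather than the paper's exact setup:

```python
import torch

def average_weights(state_dicts, coeffs=None):
    """Merge same-architecture checkpoints by (weighted) averaging in weight space."""
    if coeffs is None:
        coeffs = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(c * sd[name].float() for c, sd in zip(coeffs, state_dicts))
    return merged

# Hypothetical usage: merge an en->de and an en->fr fine-tune of one base MT model.
# merged_sd = average_weights([torch.load("mt_en_de.pt"), torch.load("mt_en_fr.pt")])
```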
AI · Neutral · arXiv – CS AI · Mar 11 · 7/10
🧠 Researchers have identified a phenomenon called 'merging collapse' where combining independently fine-tuned large language models leads to catastrophic performance degradation. The study reveals that representational incompatibility between tasks, rather than parameter conflicts, is the primary cause of merging failures.
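One standard way to quantify that kind of representational (in)compatibility is linear CKA between the two models' activations on the same inputs. The sketch below implements the textbook formula; it is one plausible diagnostic, not necessarily the paper's exact measure:

```python
import torch

def linear_cka(X, Y):
    """Linear CKA similarity between two activation matrices (samples x features).

    Values near 1 indicate highly similar representations; low values between
    two fine-tuned models would suggest representational incompatibility.
    """
    X = X - X.mean(dim=0, keepdim=True)   # center features
    Y = Y - Y.mean(dim=0, keepdim=True)
    num = torch.linalg.norm(X.T @ Y) ** 2                 # ||X^T Y||_F^2
    den = torch.linalg.norm(X.T @ X) * torch.linalg.norm(Y.T @ Y)
    return num / den
```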
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10
🧠 Researchers introduce OptMerge, a new benchmark and method for combining multiple expert Multimodal Large Language Models (MLLMs) into a single, more capable model without requiring additional training data. The approach achieves 2.48% average performance gains while reducing storage and serving costs by merging models across different modalities such as vision, audio, and video.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers introduce AdaRank, a new model-merging framework that adaptively selects optimal singular directions from task vectors when combining multiple fine-tuned models. The technique addresses cross-task interference in existing SVD-based approaches by dynamically pruning problematic components at test time, achieving state-of-the-art performance within roughly a 1% gap of the individual fine-tuned models.
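To make the SVD angle concrete, here is a minimal sketch of truncating a task vector (fine-tuned weights minus base weights) to its top singular directions. AdaRank's actual contribution is choosing those directions adaptively per task and layer, which this fixed-rank toy does not reproduce:

```python
import torch

def truncated_task_vector(base_w, tuned_w, rank):
    """Low-rank task vector for one 2-D weight: keep only the top singular directions."""
    delta = tuned_w - base_w                                  # task vector
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vh[:rank, :]            # rank-k reconstruction

# Merging two experts onto a shared base weight (illustrative fixed rank):
# merged_w = base_w + truncated_task_vector(base_w, w_a, 8) \
#                   + truncated_task_vector(base_w, w_b, 8)
```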
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠 Researchers introduce Modular Delta Merging with Orthogonal Constraints (MDM-OC), a machine learning framework that enables multiple fine-tuned models to be merged, updated, and selectively removed without performance degradation or task interference. The approach uses orthogonal projections to prevent model conflicts and supports compliance requirements like GDPR-mandated data deletion.
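The orthogonality idea can be illustrated with a Gram-Schmidt projection over flattened parameter deltas; this is a generic sketch of the principle, not the paper's implementation:

```python
import torch

def orthogonal_delta(new_delta, prior_deltas):
    """Project a flattened task delta onto the orthogonal complement of prior deltas.

    With mutually orthogonal deltas, merging is `base + sum(deltas)` and
    removing a model later is just subtracting its delta, without disturbing
    the other tasks.
    """
    d = new_delta.clone()
    for p in prior_deltas:                                   # Gram-Schmidt step per prior delta
        d = d - (torch.dot(d, p) / torch.dot(p, p)) * p
    return d
```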
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers have developed Resolving Interference (RI), a new framework that improves AI model merging by reducing cross-task interference when combining specialized models. The method makes models functionally orthogonal to other tasks using only unlabeled data, improving merging performance by up to 3.8% and generalization by up to 2.3%.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers propose ES-Merging, a new framework for combining specialized biological multimodal large language models (MLLMs) by using embedding-space signals rather than traditional parameter-based methods. The approach estimates merging coefficients at both layer-wise and element-wise granularities, outperforming existing merging techniques and even task-specific fine-tuned models on cross-modal scientific problems.
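As a rough illustration of what per-layer merging coefficients look like mechanically (how ES-Merging estimates them from embedding-space signals is the paper's contribution and is not reproduced here):

```python
def merge_with_layer_coeffs(base_sd, expert_sds, coeffs):
    """Merge expert deltas into a base state dict with per-expert, per-layer scalars.

    `coeffs[i][name]` weights expert i's delta for parameter `name`; replacing
    the scalar with a tensor of the same shape would give the element-wise
    granularity the paper also supports.
    """
    merged = {name: w.clone().float() for name, w in base_sd.items()}
    for i, sd in enumerate(expert_sds):
        for name, w in sd.items():
            merged[name] += coeffs[i][name] * (w.float() - base_sd[name].float())
    return merged
```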
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers propose a training-free paradigm for empowering Vision-Language Models with multi-modal search capabilities through cross-modal model merging. The approach uses Optimal Brain Merging (OBM) to combine text-based search agents with base VLMs without requiring expensive supervised training or reinforcement learning.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠 Researchers introduce BD-Merging, a new AI framework that improves model merging for multi-task learning by addressing bias and distribution-shift issues. The method uses uncertainty modeling and contrastive learning to create more reliable AI systems that can better handle real-world data variations.
AI · Neutral · Hugging Face Blog · Feb 19 · 4/10
🧠 The article's title indicates that PEFT (Parameter-Efficient Fine-Tuning) has introduced new merging methods, but the article body appears to be empty or unavailable, which limits detailed analysis of the specific technical developments or their implications.