#model-merging News & Analysis

12 articles tagged with #model-merging. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models

Researchers introduce M2A, a novel model merging paradigm that combines mathematical and agentic reasoning in large language models without retraining. The approach improves a Qwen3-8B model's software engineering benchmark performance from 44.0% to 51.2% by strategically injecting mathematical reasoning capabilities along directions that preserve agent behavior.

AI · Neutral · arXiv – CS AI · Apr 6 · 7/10

One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging

Researchers studied weight-space model merging for multilingual machine translation and found it significantly degrades performance when target languages differ. Analysis reveals that fine-tuning redistributes rather than sharpens language selectivity in neural networks, increasing representational divergence in higher layers that govern text generation.

AI · Neutral · arXiv – CS AI · Mar 11 · 7/10

An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Researchers have identified a phenomenon called 'merging collapse' where combining independently fine-tuned large language models leads to catastrophic performance degradation. The study reveals that representational incompatibility between tasks, rather than parameter conflicts, is the primary cause of merging failures.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

Researchers introduce OptMerge, a new benchmark and method for combining multiple expert Multimodal Large Language Models (MLLMs) into a single, more capable model without requiring additional training data. The approach achieves an average performance gain of 2.48% while reducing storage and serving costs by merging models across modalities such as vision, audio, and video.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3

AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

Researchers introduce AdaRank, a new AI model merging framework that adaptively selects optimal singular directions from task vectors to combine multiple fine-tuned models. The technique addresses cross-task interference in existing SVD-based approaches by dynamically pruning interfering components at test time, achieving state-of-the-art performance and narrowing the gap to individually fine-tuned models to nearly 1%.
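
For readers unfamiliar with the term, a task vector is commonly defined as the difference between a fine-tuned model's weights and the shared base weights. The sketch below shows only that baseline idea (plain task-vector addition), not AdaRank's SVD-based rank selection. The names `task_vector`, `merge`, and `scale` are hypothetical, and weights are represented as plain lists of floats for illustration.

```python
from typing import Dict, List

# Assumption: a model is a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def task_vector(base: Weights, finetuned: Weights) -> Weights:
    """Return finetuned - base for every parameter (the 'task vector')."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge(base: Weights, finetuned_models: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of all task vectors back onto the base weights."""
    vectors = [task_vector(base, model) for model in finetuned_models]
    merged: Weights = {}
    for name, base_vals in base.items():
        summed = [sum(v[name][i] for v in vectors) for i in range(len(base_vals))]
        merged[name] = [b + scale * s for b, s in zip(base_vals, summed)]
    return merged


# Tiny illustrative example with made-up numbers.
base = {"w": [1.0, 2.0]}
model_a = {"w": [1.5, 2.0]}  # changed only the first weight
model_b = {"w": [1.0, 3.0]}  # changed only the second weight
merged = merge(base, [model_a, model_b])
```

When two task vectors modify the same weights in conflicting ways, this naive sum is where cross-task interference arises, which is the problem the summarized work targets.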

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

Model Merging on Loss Landscape: A Geometry Perspective

Researchers introduce EpiMer, a novel framework for merging machine learning models by treating it as a geometric optimization problem on Riemannian manifolds. The method uses low-rank task vectors and curvature information to improve knowledge integration without retraining, demonstrating superior performance when merging fine-tuned CLIP-ViT models across multiple image classification tasks.

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Modular Delta Merging with Orthogonal Constraints: A Scalable Framework for Continual and Reversible Model Composition

Researchers introduce Modular Delta Merging with Orthogonal Constraints (MDM-OC), a machine learning framework that enables multiple fine-tuned models to be merged, updated, and selectively removed without performance degradation or task interference. The approach uses orthogonal projections to prevent model conflicts and supports compliance requirements like GDPR-mandated data deletion.
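
As a rough illustration of how orthogonal projections can make merged updates separable and removable, the sketch below stores each model's delta after projecting out the directions of previously stored deltas (a Gram-Schmidt step). This is an assumption about the general idea, not the MDM-OC algorithm itself. The class `DeltaStore` and its methods are hypothetical names.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def project_out(delta: Vector, basis: List[Vector]) -> Vector:
    """Remove from delta its components along each mutually orthogonal basis vector."""
    result = list(delta)
    for b in basis:
        norm_sq = dot(b, b)
        if norm_sq == 0.0:
            continue  # skip empty deltas to avoid division by zero
        coeff = dot(result, b) / norm_sq
        result = [r - coeff * x for r, x in zip(result, b)]
    return result


class DeltaStore:
    """Hypothetical container: base weights plus named, orthogonalized deltas."""

    def __init__(self, base: Vector) -> None:
        self.base = list(base)
        self.deltas: Dict[str, Vector] = {}

    def add(self, name: str, finetuned: Vector) -> None:
        raw = [f - b for f, b in zip(finetuned, self.base)]
        self.deltas[name] = project_out(raw, list(self.deltas.values()))

    def remove(self, name: str) -> None:
        # Dropping a stored delta leaves the other stored deltas unchanged.
        del self.deltas[name]

    def merged(self) -> Vector:
        out = list(self.base)
        for d in self.deltas.values():
            out = [o + x for o, x in zip(out, d)]
        return out


store = DeltaStore([0.0, 0.0])
store.add("a", [1.0, 0.0])
store.add("b", [1.0, 1.0])  # stored as [0.0, 1.0] after projection
```

Calling `store.remove("b")` then returns the merged weights to the state they had with only `"a"` present. One limitation of this simplified version: the part of a later delta that was projected away is not restored if an earlier delta is removed.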

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Researchers have developed Resolving Interference (RI), a new framework that improves AI model merging by reducing cross-task interference when combining specialized models. The method makes models functionally orthogonal to other tasks using only unlabeled data, improving merging performance by up to 3.8% and generalization by up to 2.3%.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

ES-Merging: Biological MLLM Merging via Embedding Space Signals

Researchers propose ES-Merging, a new framework for combining specialized biological multimodal large language models (MLLMs) by using embedding space signals rather than traditional parameter-based methods. The approach estimates merging coefficients at both layer-wise and element-wise granularities, outperforming existing merging techniques and even task-specific fine-tuned models on cross-modal scientific problems.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

Securing the Floor and Raising the Ceiling: A Merging-based Paradigm for Multi-modal Search Agents

Researchers propose a training-free paradigm for empowering Vision-Language Models with multi-modal search capabilities through cross-modal model merging. The approach uses Optimal Brain Merging (OBM) to combine text-based search agents with base VLMs without requiring expensive supervised training or reinforcement learning.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning

Researchers introduce BD-Merging, a new AI framework that improves model merging for multi-task learning by addressing bias and distribution shift issues. The method uses uncertainty modeling and contrastive learning to create more reliable AI systems that can better handle real-world data variations.

AI · Neutral · Hugging Face Blog · Feb 19 · 4/10 · 8

🤗 PEFT welcomes new merging methods

The article title suggests that PEFT (Parameter Efficient Fine-Tuning) has introduced new merging methods. However, the article body appears to be empty or unavailable, limiting detailed analysis of the specific technical developments or their implications.