AI · Neutral · arXiv – CS AI · 3h ago · 6/10
Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models
Researchers propose RA-MoE, a fine-tuning framework that optimizes Mixture-of-Experts language models for multilingual tasks by aligning target-language routing patterns with English task performance in middle layers. The approach outperforms standard fine-tuning across multiple models and languages, addressing a critical gap in adapting efficient LLM architectures for non-English downstream applications.