#multilingual-llms News & Analysis

5 articles tagged with #multilingual-llms. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Apr 14 · 7/10

LiveCLKTBench: Towards Reliable Evaluation of Cross-Lingual Knowledge Transfer in Multilingual LLMs

Researchers introduce LiveCLKTBench, an automated benchmark for evaluating how well multilingual large language models transfer knowledge across languages, addressing the challenge of distinguishing genuine cross-lingual transfer from pre-training artifacts. Testing across five languages reveals that transfer effectiveness depends heavily on linguistic distance, model scale, and domain, with improvements plateauing in larger models.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Beyond Bilingual Transfer: Multilingual Code-Switching in Instruction Tuning

Researchers demonstrate that multilingual code-switching (mixing multiple languages within training data) improves large language model performance across four languages at once: English, Japanese, Korean, and Chinese. The work extends previous bilingual findings to truly multilingual settings and reports consistent gains on cross-lingual benchmarks.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

DEPART: DEcomposing PARiTy across Multilingual LLMs

Researchers introduce DEPART, a Bayesian framework that systematically decomposes performance disparities across multilingual large language models into interpretable components. The study reveals that language features and representational similarity to English explain 79-92% of variance, with model identity dominating NLU tasks while benchmark-model interactions drive reasoning task differences.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models

Researchers propose RA-MoE, a fine-tuning framework that optimizes Mixture-of-Experts language models for multilingual tasks by aligning target-language routing patterns with English task performance in middle layers. The approach outperforms standard fine-tuning across multiple models and languages, addressing a critical gap in adapting efficient LLM architectures for non-English downstream applications.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Towards Reliable Multilingual LLMs-as-a-Judge: An Empirical Study

Researchers develop strategies for extending the use of large language models as evaluation tools to multilingual settings, addressing challenges in low-resource languages. The study finds that fine-tuned smaller models match the performance of proprietary models when in-domain data exists, while larger zero-shot models excel in out-of-domain scenarios, providing practical guidance for building multilingual evaluation systems.