AINeutralarXiv – CS AI · 7h ago6/10
🧠
On the Robustness of Multilingual Text Embedding Rankings Across Learning Tasks, Languages, and Benchmark Datasets
Researchers conducted a comprehensive meta-study evaluating the robustness of multilingual text embedding models across 230+ languages using the MTEB benchmark platform. The analysis reveals that LLM-based models show task-specific strengths but few models consistently perform well across all tasks and evaluation methods, highlighting how benchmarking conclusions depend heavily on dataset composition and aggregation methodology choices.