AI · Neutral · arXiv – CS AI · 7h ago · 6/10
XLGoBench: Detecting cross-lingual skill gaps with algorithmic tasks
Researchers introduce XLGoBench, a synthetic benchmark that uses algorithmic tasks to detect cross-lingual performance gaps in large language models. The benchmark is scalable, objective, and transparent, and it reveals persistent gaps in state-of-the-art models despite their claimed multilingual capabilities.