AIBullisharXiv โ CS AI ยท 5d ago6/104
๐ง
MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages
Researchers introduce MENLO, a new framework for evaluating native-like quality in large language model responses across 47 languages. The study reveals significant improvements in multilingual LLM performance through reinforcement learning and fine-tuning, though gaps with human judgment persist.