AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages
Researchers introduce MIDI, a multilingual idiom dataset covering 18 languages across high-, medium-, and low-resource tiers. Evaluations on it show that state-of-the-art NLP models struggle significantly with idiomatic expressions, particularly in low-resource languages and when interpreting literal meanings. The findings expose fundamental gaps in how current AI systems handle context-dependent language across linguistic communities.