AI · Neutral · arXiv – CS AI · 10h ago · 6/10
Align and Shine: Building High-Quality Sentence-Aligned Corpora for Multilingual Text Simplification
Researchers have created a multilingual text simplification corpus by collecting and aligning sentence-level data from comparable corpora across five languages (Catalan, English, French, Italian, and Spanish). The dataset addresses a critical gap in NLP resources for non-English languages and is publicly available for training and evaluating text simplification models.