y0news
arXiv – CS AI · 4d ago
EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training

Researchers developed EstLLM, enhancing Estonian language capabilities in multilingual LLMs through continued pretraining of Llama 3.1 8B with balanced data mixtures. The approach improved Estonian linguistic performance while maintaining English capabilities, demonstrating that targeted continued pretraining can substantially improve single-language performance in multilingual models.
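The balanced data mixture mentioned above can be sketched as a simple ratio-based sampler that interleaves target-language and English documents during continued pretraining, the usual way to mitigate forgetting of the original language. The 50/50 ratio and the function below are illustrative assumptions, not the actual EstLLM recipe:

```python
import random

def mixed_batches(et_docs, en_docs, et_ratio=0.5, batch_size=4, seed=0):
    """Yield training batches drawn from an Estonian pool and an English pool
    at a fixed ratio. Hypothetical sketch; the real EstLLM mixture weights
    are defined in the paper, not reproduced here."""
    rng = random.Random(seed)
    while et_docs and en_docs:
        batch = []
        for _ in range(batch_size):
            # Pick the pool for each slot according to the target ratio.
            pool = et_docs if rng.random() < et_ratio else en_docs
            if pool:
                batch.append(pool.pop())
        yield batch

# Toy usage with placeholder document IDs.
et = [f"et_{i}" for i in range(8)]
en = [f"en_{i}" for i in range(8)]
batches = list(mixed_batches(et, en, et_ratio=0.5, batch_size=4))
```

Raising `et_ratio` shifts the mixture toward Estonian at the cost of less English replay, which is the trade-off a balanced mixture tunes.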