y0news

#multilingual-ai News & Analysis

51 articles tagged with #multilingual-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Bangla-WhisperDiar: Fine-Tuning Whisper and PyAnnote for Bangla Long-Form Speech Recognition and Speaker Diarization

Researchers have developed Bangla-WhisperDiar, a fine-tuned speech recognition and speaker diarization system that achieves a 24.41% word error rate for ASR and a 23.92% diarization error rate. The work addresses critical gaps in Bangla language processing by combining OpenAI's Whisper model with PyAnnote's diarization framework, trained on custom datasets with extensive data augmentation.

AI · Neutral · arXiv – CS AI · May 12 · 5/10

Improving Lexical Difficulty Prediction with Context-Aligned Contrastive Learning and Ridge Ensembling

Researchers propose Context-Aligned Contrastive Regression, a machine learning approach that combines contrastive learning with ridge regression ensembling to improve lexical difficulty prediction across multiple language backgrounds. The method addresses limitations in existing regression-only models by structuring representation spaces to better capture cross-lingual alignment and ordinal difficulty rankings, showing improved performance stability across difficulty levels.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Multilingual Safety Alignment via Self-Distillation

Researchers propose Multilingual Self-Distillation (MSD), a framework that transfers safety safeguards from high-resource languages like English to vulnerable low-resource languages in large language models. The method eliminates the need for expensive multilingual response data by leveraging an LLM's existing safety capabilities, demonstrating effective cross-lingual protection across diverse jailbreak benchmarks.

AI · Bullish · arXiv – CS AI · May 9 · 6/10

ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model

Researchers introduced ANGOFA, a set of four pre-trained language models tailored for Angolan languages, built using Multilingual Adaptive Fine-tuning (MAFT) with OFA embedding initialization and synthetic data. The approach achieved 12.3 and 3.8 point improvements over previous state-of-the-art models, addressing a critical gap in NLP support for very-low-resource African languages.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants

Researchers have created the first comprehensive Arabic Cultural QA benchmark that translates questions across Modern Standard Arabic and regional dialects, converting multiple-choice questions into open-ended formats. Testing reveals that large language models significantly underperform on dialectal content and struggle with open-ended Arabic questions, highlighting critical gaps in culturally grounded language understanding.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment

Researchers applied computational lesions to multilingual large language models to study how well the models align with the brain's processing of different languages. By selectively disabling parameters, they found that a shared computational core handles 60% of multilingual processing, while language-specific components fine-tune predictions for individual languages, providing new insights into how multilingual AI aligns with human neurobiology.

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

Advancing Polish Language Modeling through Tokenizer Optimization in the Bielik v3 7B and 11B Series

Researchers have optimized the Bielik v3 language models (7B and 11B parameters) by replacing universal tokenizers with Polish-specific vocabulary, addressing inefficiencies in morphological representation. This optimization reduces token fertility, lowers inference costs, and expands effective context windows while maintaining multilingual capabilities through advanced training techniques including supervised fine-tuning and reinforcement learning.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Why Do Multilingual Reasoning Gaps Emerge in Reasoning Language Models?

Researchers identify that reasoning language models exhibit worse performance in low-resource languages due to failures in language understanding rather than reasoning capability itself. The study proposes Selective Translation, which strategically adds English translations only when understanding failures are detected, achieving near full-translation performance while translating just 20% of inputs.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Litmus (Re)Agent: A Benchmark and Agentic System for Predictive Evaluation of Multilingual Models

Researchers introduce Litmus (Re)Agent, an agentic system that predicts how multilingual AI models will perform on tasks lacking direct benchmark data. Using a controlled benchmark of 1,500 questions across six tasks, the system decomposes queries into hypotheses and synthesizes predictions through structured reasoning, outperforming competing approaches particularly when direct evidence is sparse.

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

FLeX: Fourier-based Low-rank EXpansion for multilingual transfer

Researchers propose FLeX, a parameter-efficient fine-tuning approach combining LoRA, advanced optimizers, and Fourier-based regularization to enable cross-lingual code generation across programming languages. The method achieves 42.1% pass@1 on Java tasks compared to a 34.2% baseline, demonstrating significant improvements in multilingual transfer without full model retraining.

🧠 Llama
AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison

Researchers conducted the first comprehensive analysis of emotion representations in small language models (100M-10B parameters), finding that these models do possess internal emotion vectors similar to those of larger frontier models. The study evaluated 9 models across 5 architectural families and found that emotion representations localize at middle transformer layers, with generation-based extraction methods proving superior to comprehension-based approaches.

🏢 Perplexity · 🧠 Llama
AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features

Researchers challenge the assumption that multilingual AI reasoning should simply mimic English patterns, finding that the features of effective reasoning vary significantly across languages. The study analyzed Large Reasoning Models across 10 languages and found that English-derived reasoning approaches may not transfer effectively to other languages, suggesting a need for adaptive, language-specific AI training methods.

AI · Bearish · arXiv – CS AI · Apr 7 · 6/10

Metaphors We Compute By: A Computational Audit of Cultural Translation vs. Thinking in LLMs

New research reveals that Large Language Models (LLMs) exhibit cultural bias and Western defaultism when generating metaphors across different cultural contexts. The study found that LLMs act more as cultural translators relying on dominant Western frameworks than as truly culturally aware reasoning systems, even when prompted with specific cultural identities.

AI · Neutral · arXiv – CS AI · Mar 11 · 6/10

CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models

Researchers introduce CRANE, a new framework for analyzing how multilingual large language models organize language capabilities at the neuron level. The method uses targeted interventions to identify language-specific neurons based on functional necessity rather than activation patterns, revealing asymmetric specialization where neurons contribute selectively to specific languages while maintaining broader functionality.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

Unified Vision-Language Modeling via Concept Space Alignment

Researchers introduce V-SONAR, a vision-language embedding system that extends text-only SONAR to support 1500+ languages with vision capabilities. The system demonstrates state-of-the-art performance on video captioning and multilingual vision tasks through V-LCM, which combines vision and language processing in a unified framework.

AI · Bullish · OpenAI News · Nov 3 · 6/10 · 5

Introducing IndQA

OpenAI has launched IndQA, a new benchmark designed to evaluate AI systems' performance in Indian languages and cultural contexts. The benchmark covers 12 languages and 10 knowledge areas, developed in collaboration with domain experts to test cultural understanding and reasoning capabilities.

AI · Bullish · NVIDIA AI Blog · Sep 14 · 6/10 · 2

Reaching Across the Isles: UK-LLM Brings AI to UK Languages With NVIDIA Nemotron

The UK-LLM sovereign AI initiative is developing an AI model based on NVIDIA Nemotron that can reason in both English and Welsh, targeting Wales' 850,000 Welsh speakers. This effort aims to preserve and empower Celtic languages including Cornish, Irish, Scottish Gaelic, and Welsh through advanced AI technology.

AI · Bullish · Hugging Face Blog · Aug 1 · 6/10 · 7

📚 3LM: A Benchmark for Arabic LLMs in STEM and Code

3LM introduces a new benchmark specifically designed to evaluate Arabic Large Language Models (LLMs) in STEM subjects and coding tasks. This benchmark addresses the gap in Arabic language evaluation tools for technical domains, providing a standardized way to assess AI model performance in Arabic scientific and programming contexts.

AI · Bullish · Hugging Face Blog · May 14 · 6/10 · 6

Introducing the Open Arabic LLM Leaderboard

The article introduces the Open Arabic LLM Leaderboard, a new evaluation platform for Arabic large language models. The initiative addresses the need for standardized benchmarking of AI models designed specifically for Arabic language processing and understanding.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi

Researchers have developed LilMoo, a 0.6-billion parameter Hindi language model trained from scratch using a transparent, reproducible pipeline optimized for limited compute environments. The model outperforms similarly sized multilingual baselines like Qwen2.5-0.5B and Qwen3-0.6B, demonstrating that language-specific pretraining can rival larger multilingual models.
