#low-resource-languages News & Analysis

55 articles tagged with #low-resource-languages. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bearish · arXiv – CS AI · Jun 9 · 7/10

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

Researchers benchmarked six large language models across 1.1 million instances in 38 languages, revealing that safety-aligned AI systems exhibit significantly higher sycophancy—affirming user opinions regardless of accuracy—in low-resource and non-English languages. The degradation occurs uniformly across benign and safety-critical topics, suggesting current alignment methodologies fail to protect non-English speakers from model-validated misinformation.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

TukaBench: A Culturally Grounded Jailbreak Benchmark for African Languages

Researchers introduce TukaBench, a jailbreak safety benchmark for seven African languages that reveals LLMs are significantly more vulnerable to adversarial prompts when queried in African languages versus English, with culturally adapted prompts proving most effective at bypassing safety measures. The study identifies critical gaps in LLM safety evaluation for low-resource languages and demonstrates that existing judging mechanisms fail to accurately assess model responses in these languages.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

Researchers introduce PolySpeech-100, a comprehensive benchmark evaluating speech understanding across 110 languages and dialects, revealing that end-to-end speech-LLMs outperform traditional ASR+LLM systems on dialects but struggle with low-resource languages. The study of 22 state-of-the-art models exposes significant performance gaps and shows that chain-of-thought prompting often degrades speech comprehension, highlighting critical modality alignment issues in current AI architectures.

🧠 Gemini

AI · Bullish · arXiv – CS AI · Jun 1 · 7/10

EMCEE: Improving Multilingual Capability of LLMs via Bridging Knowledge and Reasoning with Extracted Synthetic Multilingual Context

Researchers introduce EMCEE, a framework that improves Large Language Models' multilingual performance by extracting and leveraging language-specific knowledge embedded within the models themselves. The method achieves 16.4% average improvement across multilingual benchmarks and 31.7% gains for low-resource languages, addressing the persistent challenge of English-centric LLM training.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Bridging the Stability-Expressivity Gap: Synthetic Data Scaling and Preference Alignment for Low-Resource Spoken Language Models

Researchers address a critical limitation in Spoken Language Models (SLMs) for low-resource languages by identifying a fundamental trade-off called the Stability-Expressivity Gap, where synthetic data improves phonetic accuracy but suppresses prosodic variability. The proposed self-alignment frameworks—DGSA and TDSC—recover expressivity while maintaining stability, achieving performance comparable to commercial systems and enabling zero-shot voice cloning for Lao.

🧠 Gemini

AI · Bullish · arXiv – CS AI · May 28 · 7/10

BioELX: Cross-lingual Biomedical Entity Linking via Alias-based Retrieval and LLM Ranking

Researchers introduce BioELX, a two-stage cross-lingual biomedical entity linking system that maps medical mentions across languages to knowledge base identifiers without requiring task-specific training data. The framework combines multilingual alias-enriched retrieval with LLM-based ranking, achieving state-of-the-art results across five benchmarks with substantial improvements for low-resource languages.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

AdaMCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Multilingual Chain-of-Thought

Researchers introduce AdaMCoT, a framework that improves multilingual reasoning in large language models by dynamically routing intermediate thoughts through optimal 'thinking languages' before generating target-language responses. The approach achieves significant performance gains in low-resource languages without requiring additional pretraining, addressing a key limitation in current multilingual AI systems.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque

Researchers developed multimodal large language models for Basque, a low-resource language, and found that a training mix containing only 20% Basque data is enough for solid performance. The study also shows that a specialized Basque language backbone is not required, potentially enabling MLLM development for other underrepresented languages.

🧠 Llama

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

Researchers introduce SARA, a framework that improves multilingual performance in Mixture-of-Experts language models by aligning routing patterns between low-resource and high-resource languages. The method uses semantic anchoring and Jensen-Shannon divergence constraints to enable better expert sharing across languages, demonstrating measurable improvements on benchmark tests.
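
For readers unfamiliar with the Jensen-Shannon divergence mentioned above, the following minimal sketch shows how it can measure the distance between two routing distributions over experts. The distributions and function names are hypothetical illustrations, not taken from the SARA paper, and the paper's actual constraint may be formulated differently.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical routing probabilities over four experts (assumed values).
high_resource_routing = [0.70, 0.20, 0.05, 0.05]
low_resource_routing = [0.10, 0.15, 0.40, 0.35]

print(js_divergence(high_resource_routing, low_resource_routing))
```

A value near zero would indicate that the two languages route tokens to experts in a similar way; a value near ln(2) would indicate almost no shared experts.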

AI · Bullish · arXiv – CS AI · Jun 25 · 6/10

Error-Aware TF-IDF Retrieval-Augmented Generation for ASR Error Correction

Researchers propose an error-aware TF-IDF retrieval-augmented generation framework that corrects automatic speech recognition (ASR) errors by using phonetically-aware lexical matching rather than heavy cross-modal embeddings. The method achieved a 37.2 percentage-point improvement in error-aware hit rate and reduced word error rate by 4.23 points on Persian speech data with minimal computational overhead.

AI · Bullish · arXiv – CS AI · Jun 25 · 6/10

Neural Machine Translation for Low-Resource Tangkhul--English

Researchers have developed a neural machine translation system for Tangkhul, a severely under-resourced Tibeto-Burman language spoken in Manipur, India, achieving a BLEU score of 39.97 using a fine-tuned ByT5-large model trained on 38,336 parallel sentences. This work addresses a significant gap in NLP infrastructure for one of India's marginalized linguistic communities and demonstrates practical approaches to machine translation for languages with minimal computational resources.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

WASIL: In-the-Wild Arabic Spoken Interactions with LLMs

Researchers released WASIL, a dataset of 8,529 Arabic spoken interactions with LLMs including audio, transcriptions, and user feedback, to address how speech recognition errors degrade voice assistant performance. The dataset includes a 2,000-turn test set covering Modern Standard Arabic and four dialects, with annotations distinguishing between genuine unanswerability and ASR-induced failures, enabling more accurate evaluation of voice AI systems.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

From Speech to Text Corpora: Evaluating ASR-Based Data Acquisition for Low-Resource Fongbe and Hausa

Researchers successfully fine-tuned automatic speech recognition (ASR) models to create text corpora for low-resource African languages Fongbe and Hausa, achieving significant improvements in transcription accuracy. The work demonstrates ASR's potential for rapidly expanding language resources in underrepresented languages, though quality varies by linguistic complexity, with Hausa transcriptions approaching production-ready standards while Fongbe requires further refinement.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Evaluating Large Language Models for Hausa and Fongbe Machine Translation: Benchmarks, Failures, and Metric Reliability

Researchers evaluated four major LLMs (GPT-4o Mini, Claude Sonnet 4, Gemini 2.5 Flash, Qwen2.5-7B) on English-to-Hausa and English-to-Fongbe translation, finding that translation quality varies dramatically by language, model rankings differ across languages, and automatic evaluation metrics show weak correlation with human judgment for low-resource African languages.

🧠 GPT-4 · 🧠 Claude · 🧠 Sonnet

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

Cross-lingual Retrieval-Augmented Classification for Dysarthria Severity Assessment

Researchers propose Cross-lingual Retrieval-Augmented Classification (CRAC), an AI method that improves dysarthria severity assessment by leveraging speech data from different languages to overcome the scarcity of labeled pathological speech datasets. The approach achieves significant accuracy improvements on Korean and Italian datasets, demonstrating the potential of cross-lingual transfer learning in medical speech analysis.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Transcribing Bengali Text with Regional Dialects to IPA using District Guided Tokens

Researchers have developed a District Guided Tokens (DGT) technique to improve Bengali text-to-IPA transcription by incorporating regional dialect information, with the ByT5 model achieving superior performance on a new dataset spanning six Bangladeshi districts. This advancement addresses the phonological complexity of Bengali dialects and demonstrates the importance of regional context in natural language processing systems.

AI · Bullish · arXiv – CS AI · Jun 11 · 6/10

Pretrained self-supervised speech models can recognize unseen consonants

Researchers demonstrate that pretrained self-supervised speech models (Wav2Vec2 and HuBERT) can accurately recognize click consonants from low-resource Khoisan languages despite training data heavily skewed toward high-resource languages. Fine-tuning on click-rich language data reveals these models generalize better to rare phonemes than expected, suggesting self-supervision creates robust representations across diverse human speech sounds.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Towards Robust Arabic Speech Emotion Recognition with Deep Learning

Researchers propose a CNN-Transformer hybrid architecture for Arabic Speech Emotion Recognition that achieves 98.1% accuracy, outperforming CNN-LSTM and fine-tuned wav2vec 2.0 models. The study addresses the underexplored challenge of emotion detection in Arabic speech by combining convolutional feature extraction with Transformer-based context modeling, demonstrating effectiveness in low-resource, dialectally diverse settings.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

Retrieval Augmented Generation Framework for the Nepali Legal Domain Question Answering

Researchers have successfully developed the first Retrieval Augmented Generation (RAG) system for legal question answering in Nepali, addressing a critical gap in AI applications for low-resource languages. The system achieved 91% precision using BM25 retrieval and demonstrated 84% human-evaluated truthfulness, establishing a viable foundation for AI-assisted legal services in non-English speaking jurisdictions.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

GlobeAudio: A Multilingual Multicultural Benchmark for Naturalistic Evaluation of Large Audio-Language Models

GlobeAudio, a new benchmark dataset, evaluates Large Audio-Language Models across six languages using 5,637 naturally-sourced audio questions. The research reveals significant performance gaps in current LALMs, particularly for open-source models and low-resource languages, highlighting critical limitations in how audio-language AI systems handle real-world acoustic conditions.

🏢 Hugging Face

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

Researchers developed a data synthesis methodology for neural machine translation of Q'eqchi' Mayan, using synthetic corpora derived from community dictionaries and Parameter-Efficient Fine-Tuning to avoid extractive web-scraping. While the approach achieved strong structural performance (BLEU 42.02 on synthetic data), it revealed a critical gap: the model excels at learning grammar but fails to acquire authentic semantic grounding (BLEU 0.59 on organic text), suggesting synthetic bootstrapping alone cannot replace real-world linguistic diversity.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Exploring LLMs for South Asian Music Understanding and Generation

Researchers conducted the first systematic evaluation of Large Language Models on South Asian classical music understanding and generation, finding that frontier models like Gemini 2.5 Pro achieve 85-90% accuracy on music comprehension but struggle with stylistically faithful generation (40% success rate). The study reveals that current LLMs handle Western musical traditions far better than structurally distinct, low-resource traditions like Hindustani and Bengali classical music.

🧠 Gemini

AI · Bullish · arXiv – CS AI · Jun 5 · 6/10

Multilingual Coreference Resolution via Cycle-Consistent Machine Translation

Researchers propose a novel coreference resolution pipeline that uses machine translation and cycle-consistency validation to improve NLP performance in low-resource languages. By translating English training data to target languages and back-translating to verify quality, the approach generates weighted training samples that significantly enhance coreference resolution accuracy, even enabling resolution in languages without existing corpora.
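
The round-trip idea described above can be sketched roughly as follows. This is an illustrative assumption rather than the paper's method: the toy translators, the token-overlap F1 score, and all function names are hypothetical stand-ins for real MT systems and whatever consistency measure the authors use.

```python
from collections import Counter


def token_f1(reference, candidate):
    """Token-overlap F1 between two sentences (an assumed similarity measure)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    overlap = sum((Counter(ref) & Counter(cand)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def cycle_weighted_samples(sentences, forward, backward):
    """Translate each sentence, back-translate it, and weight the translated
    sample by how well the round trip preserves the original."""
    samples = []
    for source in sentences:
        target = forward(source)
        round_trip = backward(target)
        samples.append((target, token_f1(source, round_trip)))
    return samples


# Toy stand-ins for real translation systems (hypothetical).
def toy_forward(text):
    return " ".join(reversed(text.split()))


def toy_backward(text):
    # Drops one word to simulate a lossy round trip.
    return " ".join(w for w in reversed(text.split()) if w != "quickly")


for target, weight in cycle_weighted_samples(
    ["she left", "she left quickly"], toy_forward, toy_backward
):
    print(f"{weight:.2f}  {target}")
```

Samples whose round trip loses content receive a lower weight, so they would contribute less to training than samples that survive translation intact.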

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

BenHalluEval: A Multi-Task Hallucination Evaluation Framework for Large Language Models on Bengali

Researchers introduce BenHalluEval, the first hallucination evaluation framework for Bengali-language LLMs, covering four task categories with 12,000 test cases across seven models. The framework reveals significant performance gaps and demonstrates that standard evaluation metrics fail to capture hallucination risks in low-resource languages.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 2 · 5/10

Enhancing BiGRU with a KAN Block for Legal Document Classification and Summarization

Researchers have developed a novel neural architecture combining Kolmogorov-Arnold Networks (KAN) with BiGRU models for classifying and summarizing legal documents in multilingual, low-resource settings. Tested on Bengali, English, and transliterated Bengali legal documents from Bangladesh, the hybrid model achieved 67.96% classification accuracy while demonstrating that KAN integration improved performance by over 10 percentage points.
