#language-bias News & Analysis

6 articles tagged with #language-bias. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 5 · 7/10

When Surface Form Changes Moderation Decisions: A Paired Study of Code-Mixed Workflow Instability

Researchers found that content moderation systems trained on clean English perform significantly worse on code-mixed inputs (English mixed with Tamil): 26.5% of decisions flip between allowing and flagging otherwise identical content. The study reveals workflow-level failures, including more false positives on non-hateful content and a higher review burden, that standard classification metrics miss.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

TukaBench: A Culturally Grounded Jailbreak Benchmark for African Languages

Researchers introduce TukaBench, a jailbreak safety benchmark for seven African languages. It shows that LLMs are significantly more vulnerable to adversarial prompts in African languages than in English, with culturally adapted prompts proving most effective at bypassing safety measures. The study identifies critical gaps in LLM safety evaluation for low-resource languages and demonstrates that existing judging mechanisms fail to assess model responses accurately in these languages.

🧠 GPT-5

AI · Bearish · arXiv – CS AI · May 27 · 7/10

Seeing vs. Believing: Evaluating the Language Bias of Open-Source MLLMs in Counter-Intuitive Scenes

Researchers introduced CAIT, a benchmark that tests whether multimodal large language models can understand counter-intuitive visual scenes that contradict common sense. The study reveals that open-source MLLMs fail dramatically at these tasks because of language bias: they override visual evidence with statistically common text patterns. Proprietary models such as Claude and Gemini, by contrast, demonstrate robust performance.

🧠 Claude · 🧠 Gemini

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

Neural FOXP2 -- Language Specific Neuron Steering for Targeted Language Improvement in LLMs

Researchers introduce Neural FOXP2, a technique that identifies and steers language-specific neurons in large language models to shift their default behavior from English to other languages such as Hindi or Spanish. The method uses sparse autoencoders and spectral analysis to isolate a compact set of control circuits governing language preference, enabling safer, more targeted manipulation of multilingual model behavior.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

DEPART: DEcomposing PARiTy across Multilingual LLMs

Researchers introduce DEPART, a Bayesian framework that systematically decomposes performance disparities across multilingual large language models into interpretable components. The study reveals that language features and representational similarity to English explain 79-92% of variance, with model identity dominating NLU tasks while benchmark-model interactions drive reasoning task differences.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Multilingual Prompt Localization for Agent-as-a-Judge: Language and Backbone Sensitivity in Requirement-Level Evaluation

A study finds that AI model performance rankings change dramatically with the evaluation language: GPT-4o performs best in English, while Gemini leads in Arabic and Hindi. The study tested 55 development tasks across five languages and six AI models and found that no single model dominates across all languages.

🧠 GPT-4 · 🧠 Gemini