🧠 AI | 🔴 Bearish | Importance 7/10

Implicit Geographic Inference in LLM Medical Triage: Language-Driven Disparities in Emergency Recommendations

arXiv – CS AI | Qi Han Wong
🤖 AI Summary

Researchers found that large language models can give markedly different medical triage recommendations for identical symptoms depending solely on the input language: emergency room referral rates ranged from 0% to 30% across six languages even though severity scores stayed consistent. The study attributes the effect to implicit geographic inference from the choice of language rather than to translation quality, which raises serious concerns about AI bias in healthcare systems.

Analysis

This study exposes a fundamental vulnerability in how language models make real-world decisions with life-or-death consequences. The researchers tested Gemini 3.5 Flash's medical triage capabilities across six languages using identical neurological symptoms and found that the share of emergency room recommendations differed by up to 30 percentage points between languages, a large swing for a decision of this kind. The key finding is that adding geographic context (a US location) to non-English prompts increased emergency recommendations by up to 76.7 percentage points, while anchoring an English prompt to Tokyo reduced them from 30% to 6.7%. This indicates that the model infers a likely location from the language alone, which suggests it learned correlations between language, location, and healthcare-seeking behavior during training.
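The figures above mix percentages and percentage points, so a short worked example may help keep them apart. The sketch below uses only the two English-prompt numbers quoted in this summary (30% and 6.7%); the function names are illustrative and do not come from the paper.

```python
def pp_change(before_pct: float, after_pct: float) -> float:
    """Absolute change, in percentage points, between two rates given in percent."""
    return after_pct - before_pct


def relative_change(before_pct: float, after_pct: float) -> float:
    """Change relative to the starting rate, expressed in percent."""
    return (after_pct - before_pct) / before_pct * 100.0


# Rates quoted in the summary: English baseline vs. English prompt anchored to Tokyo.
english_baseline = 30.0
english_tokyo = 6.7

drop_pp = pp_change(english_baseline, english_tokyo)         # about -23.3 points
drop_rel = relative_change(english_baseline, english_tokyo)  # about -77.7 percent

print(f"Change: {drop_pp:.1f} percentage points ({drop_rel:.1f}% relative)")
```

Read this way, the Tokyo anchor removes roughly 23 percentage points, which is more than three quarters of the English baseline rate.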

The implications extend far beyond this specific test case. Medical AI systems are increasingly deployed globally, yet they appear to encode hidden assumptions about where patients live and what healthcare resources they can access. This geographic bias could systematically disadvantage non-English speakers or reinforce disparities in emergency care access. The finding that back-translation from Japanese to English reproduced English-baseline results confirms the problem lies in the model's learned associations rather than translation accuracy.

For healthcare providers, AI developers, and regulators, this research highlights the urgent need for bias audits in deployed medical systems. Because these language-driven disparities surface through simple API calls, they could affect millions of patients worldwide. Organizations relying on LLM-based triage systems should conduct similar multilingual testing before deployment, and developers should investigate whether other geographic or demographic biases exist in their models' decision-making processes.
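A multilingual audit of the kind recommended here could be sketched roughly as follows. This is a minimal illustration and not the paper's method: `ask_model` stands for whatever function calls the deployed model and maps its answer to a label, `make_stub` is a hypothetical stand-in so the example runs without an API, the 30 trials per language are an assumption, and the language code `xx` is a placeholder because this summary does not say which language produced the 0% rate.

```python
from collections import Counter
from typing import Callable, Dict


def referral_rates(
    prompts: Dict[str, str],
    ask_model: Callable[[str], str],
    trials: int = 30,  # assumed sample size per language
) -> Dict[str, float]:
    """Send each language's prompt `trials` times and return the ER referral rate in percent."""
    rates = {}
    for lang, prompt in prompts.items():
        er_count = sum(1 for _ in range(trials) if ask_model(prompt) == "ER")
        rates[lang] = 100.0 * er_count / trials
    return rates


def max_gap_pp(rates: Dict[str, float]) -> float:
    """Largest difference between any two languages, in percentage points."""
    return max(rates.values()) - min(rates.values())


def make_stub(er_counts: Dict[str, int]) -> Callable[[str], str]:
    """Hypothetical model stand-in: answers "ER" for the first er_counts[prompt] calls, then "NON_ER"."""
    calls: Counter = Counter()

    def ask(prompt: str) -> str:
        calls[prompt] += 1
        return "ER" if calls[prompt] <= er_counts[prompt] else "NON_ER"

    return ask


# Placeholder prompts; a real audit would use vetted translations of one symptom description.
prompts = {"en": "<symptoms in English>", "xx": "<same symptoms in another language>"}
stub = make_stub({prompts["en"]: 9, prompts["xx"]: 0})

rates = referral_rates(prompts, stub, trials=30)
gap = max_gap_pp(rates)
print(rates, f"max gap: {gap:.1f} percentage points")
```

In a real audit the stub would be replaced by a call to the model under test, and a gap above some agreed threshold would block deployment until it is explained.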

Key Takeaways
  • Emergency room referral rates for identical symptoms ranged from 0% to 30% across six languages, despite consistent severity scoring
  • Geographic bias is implicit in language choice; adding location context to non-English prompts increases ER recommendations by up to 76.7 percentage points
  • The disparity stems from learned model associations rather than translation quality, as confirmed by back-translation controls
  • AI-powered medical triage systems may systematically disadvantage non-English speakers if deployed without multilingual bias testing
  • Healthcare organizations should audit language-dependent biases in LLM systems before clinical deployment to prevent patient harm