y0news
🧠 AI | 🔴 Bearish | Importance 6/10

Which English Do LLMs Prefer? Triangulating Structural Bias Towards American English in Foundation Models

arXiv – CS AI | Mir Tafseer Nayeem, Davood Rafiei
🤖 AI Summary

A new research study reveals that major large language models exhibit systematic bias toward American English over British English across training data, tokenization, and outputs. The research introduces DiAlign, a method for measuring dialectal alignment, and finds evidence of linguistic homogenization that could impact global AI equity.

Key Takeaways
  • Six major pretraining corpora show systematic skew toward American English over British English varieties.
  • LLM tokenizers impose higher segmentation costs on British English forms compared to American English.
  • Generative AI models consistently prefer American English in their outputs despite global English diversity.
  • The study introduces DiAlign, a training-free method for measuring dialectal bias in language models.
  • Researchers warn of linguistic homogenization and epistemic injustice in global AI deployment.
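
To make the tokenizer takeaway concrete, here is a minimal sketch of how a "segmentation cost" could be compared across spelling variants, read here simply as the number of tokens a word is split into. Everything in it is an assumption for illustration: `TOY_VOCAB`, `greedy_tokenize`, and `segmentation_costs` are hypothetical names, the vocabulary is invented so that some American spellings happen to be whole tokens, and the greedy longest-match splitter stands in for a real subword tokenizer. It shows the measurement only; it is not the paper's method and does not reproduce its results.

```python
# Toy illustration only: this vocabulary is hypothetical and is not taken
# from any real tokenizer or from the study.
TOY_VOCAB = {"color", "col", "our", "organ", "ize", "ise", "center", "cent", "re"}

# American/British spelling pairs used as probe words.
PAIRS = [("color", "colour"), ("organize", "organise"), ("center", "centre")]


def greedy_tokenize(word, vocab):
    """Split a word by greedy longest-match against vocab.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest substring starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


def segmentation_costs(pairs, vocab):
    """Return (american, n_tokens, british, n_tokens) for each spelling pair."""
    return [
        (us, len(greedy_tokenize(us, vocab)), uk, len(greedy_tokenize(uk, vocab)))
        for us, uk in pairs
    ]


if __name__ == "__main__":
    for us, us_cost, uk, uk_cost in segmentation_costs(PAIRS, TOY_VOCAB):
        print(f"{us}: {us_cost} token(s)   {uk}: {uk_cost} token(s)")
```

With a real tokenizer, the same comparison would swap `greedy_tokenize` for that tokenizer's own encoding call and use a much larger list of dialectal variant pairs.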