Side-by-side Comparison Amplifies Dialect Bias in Language Models
Researchers demonstrate that language models exhibit significantly amplified dialect bias when they compare intent-equivalent tweets in Standard American English and African-American Vernacular English side by side rather than assessing each in isolation. This bias persists despite commercial safety alignment efforts and worsens when explicit dialect labels are added, suggesting that current evaluation methods underestimate real-world harm in ranking and decision-making contexts.
