AIBearisharXiv – CS AI · 10h ago7/10
🧠
Explanation Fairness in Large Language Models: An Empirical Analysis of Disparities in How LLMs Justify Decisions Across Demographic Groups
Researchers have identified systematic fairness disparities in how large language models explain their decisions across demographic groups, introducing the Explanation Fairness Taxonomy (EFT) to measure five dimensions of explanation inequality. Testing five major LLMs across hiring, medical, credit, and legal domains reveals statistically significant disparities in explanation quality, with stylistic inequalities appearing resistant to prompt-based fixes and likely embedded in model pre-training.
🧠 GPT-4🧠 Claude