AI Neutral · arXiv CS AI · 7h ago · 7/10
Invisible Influences: Investigating Implicit Intersectional Biases through Persona Engineering in Large Language Models
Researchers introduced BADx, a novel metric that measures how large language models amplify implicit biases when adopting different social personas, revealing that popular LLMs such as GPT-4o and DeepSeek-R1 exhibit significant context-dependent bias shifts. Evaluating five state-of-the-art models, the study demonstrates that static bias testing fails to capture dynamic bias amplification, with implications for AI safety and responsible deployment.