9 articles tagged with #bias-detection. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠 Researchers have introduced FAIRGAME, a new framework that uses game theory to identify biases in AI agent interactions. The tool enables systematic discovery of biased outcomes in multi-agent scenarios across different Large Language Models, languages, and agent characteristics.
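FAIRGAME's actual interface isn't shown in the summary; as a rough sketch of the game-theoretic idea, the snippet below runs a repeated prisoner's dilemma between agents whose personas differ in one attribute and compares their cooperation rates. `query_llm` is a hypothetical stand-in for a real chat-completion call.

```python
# Illustrative sketch of game-theoretic bias probing (not FAIRGAME's real API).
import random

PAYOFFS = {  # prisoner's dilemma: (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1,
}

def query_llm(persona: str, history: list) -> str:
    """Hypothetical LLM call; replace with a real chat-completion request.
    Returns 'C' (cooperate) or 'D' (defect)."""
    return random.choice(["C", "D"])  # placeholder policy

def cooperation_rate(persona_a: str, persona_b: str, rounds: int = 50) -> float:
    history = []
    for _ in range(rounds):
        a = query_llm(persona_a, history)
        b = query_llm(persona_b, history)
        history.append((a, b))
    return sum(1 for a, _ in history if a == "C") / rounds

# Vary only the agent's stated persona and look for systematic shifts:
for persona in ["a German-speaking agent", "a Spanish-speaking agent"]:
    print(persona, "->", cooperation_rate(persona, "a neutral opponent"))
```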
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠 Researchers introduce AdAEM, a new evaluation algorithm that automatically generates test questions to better assess value differences and biases across Large Language Models. Unlike static benchmarks, AdAEM adaptively creates controversial topics that reveal more distinguishable insights about LLMs' underlying values and cultural alignment.
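The summary doesn't spell out AdAEM's generation loop; a minimal sketch of the adaptive principle is to retain the candidate questions on which the compared models disagree most, since those are the ones that distinguish their values. `ask` below is a stubbed model call.

```python
# Sketch of adaptive test selection by inter-model disagreement (assumed,
# not AdAEM's exact algorithm).
from collections import Counter

def ask(model: str, question: str) -> str:
    """Hypothetical stand-in for querying one LLM; returns a stance label."""
    return {"m1": "agree", "m2": "disagree"}.get(model, "neutral")  # stub

def disagreement(question: str, models: list) -> float:
    answers = [ask(m, question) for m in models]
    top = Counter(answers).most_common(1)[0][1]
    return 1.0 - top / len(answers)  # 0 = unanimous, higher = more split

candidates = ["Should AI art win prizes?", "Is jaywalking acceptable?"]
models = ["m1", "m2", "m3"]
# Keep the most distinguishing questions for the benchmark:
benchmark = sorted(candidates, key=lambda q: disagreement(q, models), reverse=True)
print(benchmark)
```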
AI · Bullish · MIT News – AI · Feb 19 · 7/10
🧠 MIT researchers have developed a new method to identify and expose hidden biases, moods, personalities, and abstract concepts within large language models. This breakthrough could help address LLM vulnerabilities and enhance both the safety and performance of AI systems.
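The summary doesn't name the MIT technique, so the sketch below shows one common generic approach to exposing a hidden concept: a difference-of-means probe over hidden activations, using synthetic data in place of real model states.

```python
# Generic concept-direction probe (an assumed illustration, not the paper's
# method): the concept direction is the difference of mean activations
# between prompts with and without the target concept.
import numpy as np

rng = np.random.default_rng(0)
dim = 64
# Hypothetical hidden states for contrasting prompt sets:
acts_with = rng.normal(0.5, 1.0, size=(100, dim))
acts_without = rng.normal(0.0, 1.0, size=(100, dim))

concept = acts_with.mean(axis=0) - acts_without.mean(axis=0)
concept /= np.linalg.norm(concept)

def concept_score(hidden_state: np.ndarray) -> float:
    """Projection of an activation vector onto the concept direction."""
    return float(hidden_state @ concept)

print(concept_score(acts_with[0]), concept_score(acts_without[0]))
```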
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers introduce GLEaN, a visual explainability method that transforms complex AI bias detection into understandable portrait composites, enabling non-technical audiences to grasp how text-to-image models like Stable Diffusion XL associate occupations and identities with specific demographic characteristics.
🧠 Stable Diffusion
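GLEaN's pipeline isn't described beyond "portrait composites"; a minimal sketch of the composite idea is to average many images generated for the same occupation prompt so that demographic regularities become visible. `generate_image` is a hypothetical stand-in for a text-to-image call such as SDXL.

```python
# Sketch: pixel-wise mean of many generations for one prompt (assumed
# illustration of the composite idea, not GLEaN's implementation).
import numpy as np

def generate_image(prompt: str) -> np.ndarray:
    """Hypothetical text-to-image call; returns an HxWx3 array in [0, 1]."""
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.random((256, 256, 3))  # placeholder image

def composite(prompt: str, n: int = 32) -> np.ndarray:
    imgs = np.stack([generate_image(f"{prompt} #{i}") for i in range(n)])
    return imgs.mean(axis=0)  # pixel-wise mean portrait

avg = composite("a portrait photo of a CEO")
print(avg.shape, avg.min(), avg.max())
```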
AI · Neutral · arXiv – CS AI · Mar 16 · 6/10
🧠 Researchers have launched LLM BiasScope, an open-source web application that enables real-time bias analysis and side-by-side comparison of outputs from major language models including Google Gemini, DeepSeek, and Meta Llama. The platform uses a two-stage bias detection pipeline and provides interactive visualizations to help researchers and practitioners evaluate bias patterns across different AI models.
🏢 Hugging Face · 🧠 Gemini · 🧠 Llama
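The two stages aren't detailed in the summary; one plausible shape, sketched below, is a cheap heuristic pass that flags candidate outputs followed by a judge-model pass that scores only the flagged ones, with results compared per model. Both stages here are stubs.

```python
# Assumed two-stage shape (heuristic filter -> judge score), not BiasScope's
# documented pipeline.
FLAG_TERMS = {"always", "never", "naturally"}  # toy stage-1 heuristic

def stage1_flag(text: str) -> bool:
    return any(term in text.lower() for term in FLAG_TERMS)

def stage2_judge(text: str) -> float:
    """Hypothetical judge-LLM call returning a bias score in [0, 1]."""
    return 0.8 if stage1_flag(text) else 0.1  # stub

outputs = {  # side-by-side comparison across models
    "gemini": "Engineers are naturally male.",
    "llama": "Engineers come from many backgrounds.",
}
for model, text in outputs.items():
    flagged = stage1_flag(text)
    score = stage2_judge(text) if flagged else 0.0
    print(f"{model}: flagged={flagged} bias_score={score}")
```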
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers identify a critical flaw in Vision-Language Model evaluation for radiology, where high benchmark scores mask models' failure to generate clinically specific terminology. They propose new metrics, including Clinical Association Displacement (CAD), to measure bias and clinical signal loss across demographic groups.
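The summary doesn't reproduce CAD's formula, so the snippet below is only a loose illustration of "clinical signal loss across demographic groups": compare how often required clinical terms appear in generated reports per group, and take the gap. The term list and reports are invented.

```python
# Rough illustration of per-group clinical-term loss (NOT the paper's CAD
# definition, which is not given in this summary).
CLINICAL_TERMS = {"pneumothorax", "consolidation", "effusion"}

def term_recall(report: str) -> float:
    words = set(report.lower().split())
    return len(CLINICAL_TERMS & words) / len(CLINICAL_TERMS)

reports_by_group = {  # hypothetical generated reports per demographic group
    "group_a": ["small left effusion with consolidation", "no pneumothorax"],
    "group_b": ["lungs look abnormal", "possible issue noted"],
}
recalls = {g: sum(map(term_recall, rs)) / len(rs)
           for g, rs in reports_by_group.items()}
gap = max(recalls.values()) - min(recalls.values())
print(recalls, "gap:", gap)
```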
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers have developed an automated pipeline to detect hidden biases in Large Language Models that don't appear in the models' reasoning explanations. The system discovered previously unknown biases, such as sensitivity to Spanish fluency and writing formality, across seven LLMs in hiring, loan-approval, and university-admission tasks.
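The core counterfactual idea lends itself to a small sketch: hold an application fixed, vary one surface attribute (here, writing formality), and measure whether the decision flips even when the model's rationale never mentions that attribute. `llm_decide` is a hypothetical stand-in.

```python
# Counterfactual probing sketch (assumed illustration of the pipeline's idea).
def llm_decide(application: str) -> str:
    """Hypothetical hiring-decision call; returns 'accept' or 'reject'."""
    return "accept" if "i am pleased" in application.lower() else "reject"  # stub

BASE = "{greeting} I have 5 years of Python experience and led two teams."
variants = {
    "formal": BASE.format(greeting="I am pleased to apply."),
    "informal": BASE.format(greeting="hey, wanted to throw my hat in."),
}
decisions = {k: llm_decide(v) for k, v in variants.items()}
flips = len(set(decisions.values())) > 1
print(decisions, "decision flips with formality:", flips)
```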
AI · Neutral · arXiv – CS AI · Mar 17 · 5/10
🧠 Researchers introduce Jacobian Scopes, a new gradient-based method for interpreting how individual tokens influence Large Language Model predictions. The technique uses perturbation theory and information geometry to reveal model biases, translation strategies, and learning mechanisms, with open-source implementations and an interactive demo available.
🏢 Hugging Face
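Jacobian Scopes' exact construction isn't given here, but the underlying gradient-based idea can be shown on a toy model: differentiate a target logit with respect to the input embeddings and read per-token gradient norms as influence scores.

```python
# Toy per-token influence via input-embedding gradients (a generic sketch,
# not the paper's implementation).
import torch

torch.manual_seed(0)
vocab, dim = 100, 16
embed = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

tokens = torch.tensor([5, 42, 7, 99])
x = embed(tokens).detach().requires_grad_(True)  # leaf we differentiate w.r.t.
logits = head(x.mean(dim=0))                     # toy "model": mean-pool + head
logits[42].backward()                            # gradient of one target logit

influence = x.grad.norm(dim=1)  # one score per input token
print(influence)
```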
AI · Neutral · Hugging Face Blog · Jun 26 · 5/10
🧠 The article discusses bias issues in text-to-image AI models as part of the Ethics and Society Newsletter series. Without the full article content, specific details about the types of bias and their implications cannot be determined.