y0news

#ai-bias News & Analysis

47 articles tagged with #ai-bias. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

A comprehensive audit of three major AI models reveals that personalized user contexts significantly reshape brand recommendations in commercial AI assistants, with mid-market brands experiencing up to 75% recommendation volatility while category leaders maintain 80% consistency across personas. The study demonstrates that AI recommendation bias is strongly correlated with model architecture and retrieval strategies, with implications for fair evaluation and brand perception measurement.

🏢 OpenAI · 🏢 Anthropic
AI · Neutral · arXiv – CS AI · 3d ago · 7/10

MIRA: A Bilingual Benchmark for Medical Information Response Audit

Researchers introduced MIRA, a bilingual benchmark testing whether large language models provide consistent medical information across different user phrasings, health literacy levels, and languages. The study revealed that LLMs systematically omit key medical details when responding to low-health-literacy queries, a pattern termed Differential Information Dilution (DID), with implications for equitable health information access.

🧠 Claude
AI · Neutral · arXiv – CS AI · 3d ago · 7/10

Benchmarking Fairness in Spiking Neural Networks: Data Bias, Spurious Features, and Hardware Effects

Researchers introduce the first systematic fairness benchmark for Spiking Neural Networks (SNNs), revealing that biased training data causes 23% higher false positive rates for underrepresented groups, while hardware constraints amplify accuracy gaps by up to 41% in edge deployments. The study demonstrates that existing bias mitigation strategies fail under resource constraints, establishing the need for co-designed approaches that balance fairness with hardware efficiency.

AI · Bearish · arXiv – CS AI · 3d ago · 7/10

Auditing medical multi-agent AI reveals risks of false consensus

Researchers introduced MedAgentAudit, a framework that reveals critical safety failures in medical multi-agent AI systems, finding that collaborative AI architectures frequently exhibit unsupported observations, evidence avoidance, and decision-making biases rather than genuine reasoning. The study across 14,400 cases and six AI architectures demonstrates that consensus-based medical AI systems are unreliable for clinical use without fundamental process-level improvements.

AI · Bearish · arXiv – CS AI · 3d ago · 7/10

Examining Agents' Bias Amplification versus Suppression in Multi-Agent Systems

Researchers demonstrate that biases in multi-agent AI systems can amplify at the system level rather than cancel out, with uniformly biased agents producing fairness degradation exceeding the sum of individual biases. The study introduces Favor Bias Strength (FBS), a metric to measure bias alteration, and reveals critical vulnerabilities in fairness preservation across deployed multi-agent systems.

AI · Bearish · arXiv – CS AI · May 11 · 7/10

LLM hallucinations in the wild: Large-scale evidence from non-existent citations

Researchers auditing 2.5 million scientific papers found 146,932 hallucinated citations in 2025 alone, with non-existent references surging sharply after LLM adoption. The errors concentrate in AI-heavy fields and papers with linguistic signatures of AI assistance, while current journal moderation fails to catch most instances, threatening scientific integrity and reinforcing existing biases in academic credit attribution.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

Edu-MMBias: A Three-Tier Multimodal Benchmark for Auditing Social Bias in Vision-Language Models under Educational Contexts

Researchers present Edu-MMBias, a comprehensive framework for detecting social biases in Vision-Language Models used in educational settings. The study reveals that VLMs exhibit compensatory class bias while harboring persistent health and racial stereotypes, and critically, that visual inputs bypass text-based safety mechanisms to trigger hidden biases.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

Demographic and Linguistic Bias Evaluation in Omnimodal Language Models

Researchers evaluated four omnimodal AI models across text, image, audio, and video processing, finding substantial demographic and linguistic biases particularly in audio understanding tasks. The study reveals significant accuracy disparities across age, gender, language, and skin tone, with audio tasks showing prediction collapse toward narrow categories, highlighting fairness concerns as these models see wider real-world deployment.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

Cross-Cultural Value Awareness in Large Vision-Language Models

Researchers have conducted a comprehensive study examining how large vision-language models (LVLMs) exhibit cultural stereotypes and biases when making judgments about people's moral, ethical, and political values based on cultural context cues in images. Using counterfactual image sets and Moral Foundations Theory, the analysis across five popular LVLMs reveals significant concerns about AI fairness beyond traditional social biases, with implications for deployed AI systems used globally.

AI · Neutral · arXiv – CS AI · Apr 6 · 7/10

Mitigating LLM biases toward spurious social contexts using direct preference optimization

Researchers developed Debiasing-DPO, a new training method that reduces harmful biases in large language models by 84% while improving accuracy by 52%. The study found that LLMs can shift predictions by up to 1.48 points when exposed to irrelevant contextual information like demographics, highlighting critical risks for high-stakes AI applications.

🧠 Llama
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

Large Language Models Reproduce Racial Stereotypes When Used for Text Annotation

A comprehensive study of 19 large language models reveals systematic racial bias in automated text annotation, with over 4 million judgments showing LLMs consistently reproduce harmful stereotypes based on names and dialect. The research demonstrates that AI models rate texts with Black-associated names as more aggressive and those written in African American Vernacular English as less professional and more toxic.

AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

Widespread Gender and Pronoun Bias in Moral Judgments Across LLMs

A comprehensive study of six major LLM families reveals systematic biases in moral judgments based on gender pronouns and grammatical markers. The research found that AI models consistently favor non-binary subjects while penalizing male subjects in fairness assessments, raising concerns about embedded biases in AI ethical decision-making.

🏢 Meta · 🧠 Grok
AI · Neutral · arXiv – CS AI · Mar 5 · 6/10

Automated Concept Discovery for LLM-as-a-Judge Preference Analysis

Researchers developed automated methods to discover biases in Large Language Models when used as judges, analyzing over 27,000 paired responses. The study found that LLM judges exhibit systematic biases, including a stronger preference than humans for refusing sensitive requests, a preference for concrete and empathetic responses, and a bias against certain legal guidance.

AI · Bearish · arXiv – CS AI · Mar 5 · 6/10

Baseline Performance of AI Tools in Classifying Cognitive Demand of Mathematical Tasks

A research study tested 11 AI tools on their ability to classify the cognitive demand of mathematical tasks, finding they achieved only 63% accuracy on average with no tool exceeding 83%. The tools showed systematic bias toward middle-category classifications and struggled with reasoning about underlying cognitive processes versus surface textual features.

🏢 Perplexity · 🧠 ChatGPT · 🧠 Claude
AI · Neutral · arXiv – CS AI · Mar 5 · 6/10

Order Is Not Layout: Order-to-Space Bias in Image Generation

Researchers have identified Order-to-Space Bias (OTS) in modern image generation models, where the order in which entities are mentioned in a text prompt incorrectly determines spatial layout and role assignments. The study introduces OTS-Bench to measure this bias and demonstrates that targeted fine-tuning and early-stage interventions can reduce the problem while maintaining generation quality.

AI · Bearish · MIT News – AI · Feb 19 · 7/10 · 4

Study: AI chatbots provide less-accurate information to vulnerable users

MIT research reveals that leading AI chatbots deliver less accurate information to vulnerable user groups, including those with lower English proficiency, less formal education, and non-US backgrounds. The study highlights concerning disparities in AI performance that could exacerbate existing inequalities in access to reliable information.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

OccuReward: LLM-Guided Occupant-Centric Reward Shaping for Demographic Equity in Grid-Interactive Buildings

Researchers introduce OccuReward, an LLM-guided framework that shapes reward functions for AI-controlled building energy systems to promote demographic equity in occupant comfort. Testing with four occupant profiles reveals significant disparities in initial AI performance, with elderly female occupants experiencing the lowest satisfaction, though targeted refinement achieved dramatic improvements (567% for elderly females) while reducing energy costs by 3.2%.

🧠 Gemini
AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation

Researchers benchmarked 43 large language models used for academic scholar recommendations, revealing that prompt design significantly affects recommendation quality and diversity. The study found that model choice, persona prompting (language, location, role), and context variables independently shape which scholars are recommended, with geographic location prompts producing the most variation in factuality and representativeness across disciplines.

AI · Bearish · arXiv – CS AI · 4d ago · 6/10

Generative artificial intelligence and the marginalization of minoritized knowledges in higher education: the case of disability

A new research paper examines how generative AI systems in higher education perpetuate marginalization of non-Western epistemologies and disability perspectives due to Western-centric training data. The study argues that AI's claim to neutrality masks its active role in reinforcing epistemic coloniality, with persons with disabilities experiencing particular exclusion from both AI design processes and knowledge validation systems.

AI · Bearish · Decrypt – AI · 4d ago · 6/10

AI Chatbots Show Bias Toward Catholicism, Researchers Say

Researchers have identified systematic bias in AI chatbots that steer users toward Catholicism while steering them away from religions like Jehovah's Witnesses. This finding raises concerns about the neutrality and fairness of widely-used AI systems in handling sensitive topics like religion.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Playing games with knowledge: AI-Induced delusions need game theoretic interventions

Researchers propose that conversational AI systems create epistemic problems not through flawed models but through game-theoretic dynamics where sycophantic responses reinforce user biases. They introduce an "Epistemic Mediator" mechanism with belief versioning to break feedback loops that lead users toward delusional certainty, achieving a 48x reduction in belief spirals.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Researchers introduce CyBiasBench, a benchmark revealing that LLM agents deployed for cybersecurity attacks exhibit inherent biases toward specific attack families regardless of prompting. The study demonstrates agents resist steering away from their preferred attack patterns, suggesting these biases are fundamental agent characteristics rather than prompt-dependent behaviors.

AI · Bearish · Fortune Crypto · May 10 · 6/10

AI generated identical résumés for a man and a woman: Hers was more likely to be labeled ‘weak,’ while his got a 97% approval rating

A study revealed that identical résumés generated by AI received dramatically different evaluations based on the applicant's perceived gender, with a woman's résumé labeled 'weak' while an identical man's résumé achieved a 97% approval rating. This finding highlights gender bias in AI evaluation systems and suggests that fear of harsher judgment may discourage people from adopting AI tools.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

People-Centred Medical Image Analysis

Researchers propose PecMan, a human-AI framework designed to optimize fairness, accuracy, and clinical workflow integration simultaneously in medical image analysis. The framework addresses the gap between high-performing AI diagnostic systems and their limited real-world adoption by balancing performance across diverse patient populations while respecting clinician workload constraints.

🏢 Meta