y0news

#alignment-risk News & Analysis

1 article tagged with #alignment-risk. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 article
AI · Bearish · arXiv – CS AI · 4h ago · 7/10

Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models

Researchers tested whether large language models exhibit the Identifiable Victim Effect (IVE), a well-documented cognitive bias in which people prioritize helping a specific individual over a larger group facing equal hardship. Across 51,955 API trials spanning 16 frontier models, instruction-tuned LLMs showed an amplified IVE compared to humans, while reasoning-specialized models inverted the effect, raising concerns about AI deployment in humanitarian decision-making.

OpenAI · Anthropic · xAI