y0news
AI · Neutral · arXiv, CS AI · 6h ago

GMP: A Benchmark for Content Moderation under Co-occurring Violations and Dynamic Rules

Researchers introduce GMP, a new benchmark that highlights critical challenges for AI content moderation systems when handling co-occurring policy violations and dynamic platform rules. The study reveals that current large language models struggle to moderate consistently when policies are unstable or context-dependent, leading either to over-censorship or to harmful content being allowed through.