AI · Neutral · arXiv – CS AI · Feb 27

Correcting Human Labels for Rater Effects in AI Evaluation: An Item Response Theory Approach

Researchers propose using psychometric modeling to correct systematic rater biases in human evaluations of AI systems, showing how Item Response Theory (IRT) can separate the true quality of AI outputs from inconsistencies in rater behavior. The approach was tested on OpenAI's summarization dataset and improved the reliability of AI model performance measurements.
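
To make the idea of separating output quality from rater behavior concrete, here is a minimal sketch. It is not the paper's IRT model: it assumes a much simpler additive model in which each observed score is the item's quality plus a per-rater offset, fitted by alternating averages. The function name `fit_additive_rater_model`, the toy ratings, and the rater labels are all hypothetical and chosen only for illustration.

```python
# Simplified stand-in for a rater-effects model (NOT the paper's IRT model).
# Assumption: score ≈ quality[item] + offset[rater], with offsets averaging zero.

def fit_additive_rater_model(ratings, n_iter=200):
    """Estimate item quality and rater offsets from (item, rater, score) triples."""
    quality = {item: 0.0 for item, _, _ in ratings}
    offset = {rater: 0.0 for _, rater, _ in ratings}

    for _ in range(n_iter):
        # Update each item's quality from its scores with rater offsets removed.
        for item in quality:
            obs = [s - offset[r] for i, r, s in ratings if i == item]
            quality[item] = sum(obs) / len(obs)
        # Update each rater's offset from the residuals of their own ratings.
        for rater in offset:
            obs = [s - quality[i] for i, r, s in ratings if r == rater]
            offset[rater] = sum(obs) / len(obs)
        # Pin the scale: force offsets to average zero, shifting quality to match.
        shift = sum(offset.values()) / len(offset)
        for rater in offset:
            offset[rater] -= shift
        for item in quality:
            quality[item] += shift
    return quality, offset


# Hypothetical toy data: a lenient rater scored summary A, a harsh rater scored
# summary B, and both scored summary C (which links the two raters).
ratings = [
    ("A", "lenient", 4.0),
    ("B", "harsh", 3.0),
    ("C", "lenient", 3.0),
    ("C", "harsh", 1.0),
]

raw_mean = {}
for item in {i for i, _, _ in ratings}:
    scores = [s for i, _, s in ratings if i == item]
    raw_mean[item] = sum(scores) / len(scores)

quality, offset = fit_additive_rater_model(ratings)
```

In this toy data the raw averages rank summary A above summary B, simply because A happened to get the lenient rater. Once rater offsets are estimated, the adjusted quality ranks B above A. A full IRT treatment would also model ordinal rating categories and rater-specific scale use, which this sketch ignores.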