y0news

#judge-aggregation News & Analysis

1 article tagged with #judge-aggregation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 14h ago · 6/10

Who can we trust? LLM-as-a-jury for Comparative Assessment

Researchers propose BT-sigma, a method for aggregating Large Language Model (LLM) judgments in comparative evaluations that accounts for differences in judge reliability without requiring human supervision. The method models each LLM judge's discriminative capability, which serves as an unsupervised calibration mechanism, and is reported to improve ranking accuracy significantly over traditional averaging.
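
To make the idea of a per-judge "discriminative capability" concrete, here is a minimal sketch in Python. It is not the paper's BT-sigma formulation, which the summary does not spell out. It assumes a Bradley-Terry-style model in which judge k prefers item i over item j with probability sigmoid(a_k * (s_i - s_j)), where s holds latent item scores and a_k is a per-judge scale. The function names, parameters, and the plain gradient-ascent fit are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_comparisons(true_scores, true_scales, repeats, rng):
    """Hypothetical helper: draw pairwise votes from every judge on every item pair."""
    judge, left, right = [], [], []
    n_items = len(true_scores)
    for k in range(len(true_scales)):
        for i in range(n_items):
            for j in range(i + 1, n_items):
                judge += [k] * repeats
                left += [i] * repeats
                right += [j] * repeats
    judge, left, right = np.array(judge), np.array(left), np.array(right)
    # Assumed model: P(left preferred) = sigmoid(a_k * (s_left - s_right)).
    logits = true_scales[judge] * (true_scores[left] - true_scores[right])
    left_wins = rng.random(len(logits)) < 1.0 / (1.0 + np.exp(-logits))
    return judge, left, right, left_wins


def fit_bt_with_judge_scale(judge, left, right, left_wins,
                            n_items, n_judges, steps=3000, lr=0.2):
    """Jointly fit item scores s and per-judge scales a by gradient ascent
    on the log-likelihood. No human labels are used, only the judges' votes."""
    s = np.zeros(n_items)
    a = np.ones(n_judges)
    y = left_wins.astype(float)
    for _ in range(steps):
        diff = s[left] - s[right]
        p = 1.0 / (1.0 + np.exp(-a[judge] * diff))
        resid = (y - p) / len(y)  # mean-normalised residual
        grad_s = np.zeros(n_items)
        np.add.at(grad_s, left, a[judge] * resid)
        np.add.at(grad_s, right, -a[judge] * resid)
        grad_a = np.zeros(n_judges)
        np.add.at(grad_a, judge, diff * resid)
        s += lr * grad_s
        a += lr * grad_a
    # Only the products a_k * (s_i - s_j) are identified, so fix the scale:
    # centre the scores, set their standard deviation to 1, rescale a to match.
    s = s - s.mean()
    scale = s.std()
    if scale > 0:
        s, a = s / scale, a * scale
    return s, a
```

In this sketch, a judge whose votes track the consensus ordering closely ends up with a large a_k and so carries more weight in the fitted ranking, while a near-random judge is pushed toward a_k close to zero. Plain averaging of votes corresponds to forcing every a_k to be equal.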