y0news

#evaluation-transparency News & Analysis

1 article tagged with #evaluation-transparency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 18h ago · 7/10

Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation

Researchers conducted a large-scale analysis of human evaluation protocols in 284 *CL conference papers (2023-2025) and found widespread under-reporting of critical study design details, which undermines reproducibility. These transparency gaps in how text generation quality is assessed leave the measurement methodology, evaluator credentials, and interpretation of results ambiguous. The findings prompt actionable recommendations for improved reporting standards.