y0news
#frontier-llms
1 article
AI · Neutral · arXiv – CS AI · 1d ago · 6/10

Qworld: Question-Specific Evaluation Criteria for LLMs

Researchers introduce Qworld, a new method for evaluating large language models that generates question-specific criteria using recursive expansion trees instead of static rubrics. The approach covers 89% of expert-authored criteria and reveals capability differences across 11 frontier LLMs that traditional evaluation methods miss.
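The summary gives no implementation details, so the following is only a minimal sketch of what a recursive expansion tree for question-specific criteria might look like. Everything in it is an assumption made for illustration: the names `CriterionNode`, `expand`, `leaves`, `toy_expander`, and `coverage` are hypothetical, the lookup table stands in for whatever model call would actually propose sub-criteria, and the exact-match coverage measure is a placeholder, not the matching procedure behind the reported 89% figure.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CriterionNode:
    """One node in a hypothetical expansion tree: a question, aspect, or criterion."""
    text: str
    children: List["CriterionNode"] = field(default_factory=list)


def expand(node: CriterionNode,
           expander: Callable[[str], List[str]],
           depth: int,
           max_depth: int) -> None:
    """Recursively grow the tree by asking `expander` for finer-grained items."""
    if depth >= max_depth:
        return
    for sub in expander(node.text):
        child = CriterionNode(sub)
        node.children.append(child)
        expand(child, expander, depth + 1, max_depth)


def leaves(node: CriterionNode) -> List[str]:
    """Collect leaf texts; in this sketch the leaves serve as the final criteria."""
    if not node.children:
        return [node.text]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def toy_expander(text: str) -> List[str]:
    """Stand-in for a model call (assumption): a fixed lookup table."""
    table = {
        "How should I store passwords?": ["security", "usability"],
        "security": ["uses a slow salted hash", "never stores plaintext"],
    }
    return table.get(text, [])


def coverage(generated: List[str], expert: List[str]) -> float:
    """Fraction of expert criteria found verbatim (case-insensitive) in `generated`."""
    gen = {g.lower() for g in generated}
    if not expert:
        return 0.0
    return sum(1 for e in expert if e.lower() in gen) / len(expert)


root = CriterionNode("How should I store passwords?")
expand(root, toy_expander, depth=0, max_depth=3)
criteria = leaves(root)
score = coverage(criteria, ["never stores plaintext", "mentions rate limiting"])
```

In this toy run the tree is grown from the question itself, so a different question would yield a different set of criteria, which is the contrast with a static rubric that the summary draws.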