y0news
#reasoning-abilities (1 article)
AI · Neutral · Hugging Face Blog · Feb 25/108

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

The NPHardEval Leaderboard introduces an evaluation framework that assesses the reasoning capabilities of large language models using problems drawn from different computational complexity classes, with dynamic updates. It aims to make testing of LLM reasoning more rigorous.