AINeutralHugging Face Blog · Feb 25/108
🧠
NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates
NPHardEval Leaderboard introduces a new evaluation framework for assessing large language models' reasoning capabilities through computational complexity classes with dynamic updates. The leaderboard aims to provide more rigorous testing of LLM reasoning abilities by incorporating problems from different complexity categories.