AI · Bearish · arXiv – CS AI · 4h ago · 6/10
Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
Researchers introduced ChomskyBench, a benchmark that uses the Chomsky hierarchy to evaluate the formal reasoning capabilities of large language models. The study finds that although larger models perform better, current LLMs face severe efficiency barriers and remain far less efficient than traditional algorithmic programs on formal reasoning tasks.