AIBearisharXiv โ CS AI ยท 8h ago6/10
๐ง
MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Elucidation
Researchers introduce MolQuest, a new benchmark for evaluating AI models' ability to perform complex chemical structure elucidation through multi-step reasoning. Even state-of-the-art AI models achieve only 50% accuracy on this real-world scientific task, revealing significant limitations in current AI systems' strategic reasoning capabilities.