AI · Bearish · arXiv – CS AI · 4h ago · 6/10
From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics
A new study reveals that large language models, despite excelling at benchmark math problems, struggle significantly with contextual mathematical reasoning where problems are embedded in real-world scenarios. The research shows performance drops of 13-34 points for open-source models and 13-20 points for proprietary models when abstract math problems are presented in contextual settings.