AIBearish · arXiv CS AI · 4h ago · 6/10
🧠
LLMs Struggle with Abstract Meaning Comprehension More Than Expected
New research finds that large language models such as GPT-4o struggle significantly with abstract meaning comprehension across zero-shot, one-shot, and few-shot settings, while fine-tuned encoder models like BERT and RoBERTa perform better. A bidirectional attention classifier inspired by human cognitive strategies improved accuracy by 3-4% on abstract reasoning tasks, pointing to a persistent gap in how modern LLMs handle non-concrete, high-level semantics.
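The contrast the post draws (GPT-style decoders vs. BERT-style bidirectional models) comes down to the attention mask. A minimal sketch of the difference, in plain Python with toy token vectors, where each token serves as its own query, key, and value (this is an illustration of masked vs. unmasked self-attention, not the paper's actual classifier):

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(s for s in scores if s != float("-inf"))
    exps = [math.exp(s - m) if s != float("-inf") else 0.0 for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq, causal=False):
    """Scaled dot-product self-attention over a list of token vectors.

    causal=True masks future positions (GPT-style decoder);
    causal=False lets every token attend to all tokens (BERT-style).
    """
    d = len(seq[0])
    out = []
    for i, q in enumerate(seq):
        scores = []
        for j, k in enumerate(seq):
            if causal and j > i:
                scores.append(float("-inf"))  # future token is masked out
            else:
                scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
        w = softmax(scores)
        # Weighted sum of value vectors (values = the token vectors themselves).
        out.append([sum(w[j] * seq[j][t] for j in range(len(seq)))
                    for t in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0]]
bi = self_attention(tokens, causal=False)   # token 0 sees token 1
uni = self_attention(tokens, causal=True)   # token 0 sees only itself
```

Under a causal mask the first token's output is just its own vector, while the bidirectional version mixes in later context, which is the property a fine-tuned bidirectional classifier exploits.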
🧠 GPT-4