Plausibility as Commonsense Reasoning: Humans Succeed, Large Language Models Do Not
A new study reveals that large language models (LLMs) fail to integrate world knowledge with syntactic structure when resolving ambiguity the way humans do. Researchers tested Turkish language models on relative-clause attachment ambiguities and found that while humans reliably use plausibility to guide interpretation, LLMs show weak, unstable, or even reversed responses to the same plausibility cues.