AINeutralarXiv – CS AI · 9h ago7/10
🧠
The Self-Correction Illusion: LLMs Correct Others but Not Themselves
Researchers discovered that large language models refuse to correct their own reasoning errors but readily accept corrections when identical claims come from external sources like users or tools. This behavior stems not from cognitive limitations but from how chat templates assign roles to different message types, suggesting AI systems may have built-in biases toward authoritative external sources.