When Context Misleads: Surprisal, Energy and Attention Entropy as Metrics of Coherence Illusions in LLMs
Researchers found that Dutch language models exhibit coherence illusions similar to those observed in humans: incoherent text appears coherent when a matching distractor precedes it. Using surprisal, attention entropy, and energy as metrics, they identified shared mechanisms underlying these illusions across different model architectures.
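To make the three metrics concrete, the following is a minimal sketch under common textbook definitions; the summary above does not specify the exact formulations used in the study, so these are assumptions. Surprisal is taken as the negative log-probability of a token given its context, attention entropy as the Shannon entropy of one attention head's weight distribution, and energy as the negative log-sum-exp of the output logits. The function names and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))


def surprisal_bits(logits, token_id):
    """Surprisal of the observed token in bits: -log2 p(token | context).

    Assumption: `logits` is the model's 1-D next-token logit vector.
    """
    return float(-log_softmax(logits)[token_id] / np.log(2))


def attention_entropy_bits(weights):
    """Shannon entropy (bits) of one attention distribution.

    Assumption: `weights` are non-negative attention weights over context
    positions; they are renormalised so that they sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    nonzero = w[w > 0]  # treat 0 * log(0) as 0
    return float(-(nonzero * np.log2(nonzero)).sum())


def energy(logits, temperature=1.0):
    """Energy score, assumed here to be -T * logsumexp(logits / T)."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max()
    return float(-temperature * (m + np.log(np.exp(z - m).sum())))
```

Under these assumed definitions, a uniform distribution over four tokens gives a surprisal of 2 bits for any token, and attention spread evenly over four positions gives an entropy of 2 bits, while attention concentrated on a single position gives 0 bits. Higher surprisal marks a less expected continuation, and higher attention entropy marks more diffuse attention over the context.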