AINeutralarXiv – CS AI · 6h ago7/10
🧠
Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining
Researchers discovered that language models forget learned rules midway through training despite continued evidence in data—a phenomenon called 'natural ungrokking.' The survival of rules depends predictably on how often they appear in training data, and attempts to restore forgotten rules through data manipulation fail despite successfully destroying them, revealing asymmetric control over model knowledge.