Repeated Shared Access Enables Grokking, but Edit Propagation Depends on a Fine-Grained Addressable Memory
Researchers compare four neural network architectures for factual knowledge propagation in question-answering systems, finding that repeated shared memory access enables out-of-distribution generalization ('grokking'), but only architectures with fine-grained addressable memory can effectively propagate edited facts. The study dissociates learning capability from editing affordance, revealing that looped computation and explicit memory mechanisms serve different functional purposes.
This research addresses a fundamental challenge in machine learning: how neural networks learn compositional knowledge and how those learned facts can be modified. The study introduces a controlled experimental framework using synthetic knowledge-graph QA tasks to isolate architectural factors affecting two distinct capabilities: generalization and editability.
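To make the setup concrete, here is a minimal sketch of what a synthetic knowledge-graph QA task of this kind could look like. It is an illustration only: the summary does not specify the study's actual data format, so the entity and relation names, the graph size, and the helper functions `build_kg` and `two_hop_questions` are all hypothetical.

```python
import random

def build_kg(num_entities, relations, seed=0):
    """Hypothetical toy graph: every (head, relation) pair maps to one tail entity."""
    rng = random.Random(seed)
    entities = [f"e{i}" for i in range(num_entities)]
    return {(h, r): rng.choice(entities) for h in entities for r in relations}

def two_hop_questions(kg, relations):
    """Compose atomic facts into two-hop questions: (head, r1, r2) -> answer."""
    questions = []
    for (head, r1), mid in kg.items():
        for r2 in relations:
            # The answer is found by following r1 from head, then r2 from the result.
            questions.append(((head, r1, r2), kg[(mid, r2)]))
    return questions

relations = ["r0", "r1"]  # assumed relation names, for illustration only
kg = build_kg(5, relations)
questions = two_hop_questions(kg, relations)
```

Because every two-hop answer is derived from atomic facts, a setup like this allows generalization (answering unseen compositions) and editability (changing one atomic fact) to be measured separately.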
The key finding separates two previously conflated properties. Repeated shared access, whether through loop recurrence or explicit memory rereading, enables models to generalize out of distribution, but only architectures with explicit, fine-grained addressable memory (Dense+Mem and LMC) successfully propagate factual edits. Memory-equipped models achieve 71-96% edit propagation compared to 0-30% for loop-based approaches, a statistically significant difference. This dissociation matters because it shows that an architectural choice that optimizes one capability may not optimize the other.
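The notion of edit propagation can be illustrated with a small symbolic sketch. This is not the study's neural architectures or its evaluation code: the dictionary `memory` stands in for an addressable fact store, the `cached` table stands in for any model whose composed answers are not re-read from an editable store, and the function names and example facts are hypothetical.

```python
def answer_two_hop(memory, head, r1, r2):
    """Answer a two-hop query by reading the fact store twice."""
    mid = memory[(head, r1)]
    return memory[(mid, r2)]

def propagation_rate(predict, edited_memory, affected_queries):
    """Fraction of affected queries whose prediction matches the post-edit answer."""
    hits = 0
    for head, r1, r2 in affected_queries:
        expected = answer_two_hop(edited_memory, head, r1, r2)
        if predict(head, r1, r2) == expected:
            hits += 1
    return hits / len(affected_queries)

# Hypothetical facts, for illustration only.
memory = {
    ("alice", "works_at"): "acme",
    ("acme", "based_in"): "paris",
    ("globex", "based_in"): "tokyo",
}
queries = [("alice", "works_at", "based_in")]

# Snapshot of composed answers taken before the edit.
cached = {q: answer_two_hop(memory, *q) for q in queries}

# Single-fact edit: overwrite one addressable slot.
memory[("alice", "works_at")] = "globex"

reread_rate = propagation_rate(
    lambda h, r1, r2: answer_two_hop(memory, h, r1, r2), memory, queries
)
cached_rate = propagation_rate(
    lambda h, r1, r2: cached[(h, r1, r2)], memory, queries
)
```

In this toy, the predictor that re-reads the edited store reflects the change in every dependent query, while the cached predictor keeps returning the pre-edit answer. Real models fall somewhere between these extremes, which is what the reported propagation percentages quantify.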
For the broader AI field, these results inform architecture design for systems that need both strong generalization and controlled fact modification, which is critical for applications like knowledge base maintenance, model alignment, and continual learning. The mechanistic insights about where edited facts localize within network computations provide a foundation for developing more interpretable and editable models.
The research has implications for developers building AI systems requiring fact updates without full retraining. Understanding that memory mechanisms and computation flow interact to determine editability enables more intentional architectural choices. Future work should explore whether these findings scale to realistic models and knowledge domains, and whether insights about fact localization transfer to larger, more complex systems.
- Repeated shared access enables out-of-distribution generalization in neural networks, but doesn't guarantee efficient fact editing
- Fine-grained addressable memory is necessary for propagating single-fact edits with high specificity and success rates
- Memory-based architectures achieve 71-96% edit propagation versus 0-30% for non-memory architectures on factual edits
- The timing of when edited facts are injected and how much computation reuses them determines propagation effectiveness
- Learning competence and editing affordance are separable properties requiring different architectural features