#catastrophic-forgetting News & Analysis

56 articles tagged with #catastrophic-forgetting. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 11 · 7/10

Federated continual learning: A comprehensive survey on lifelong and privacy-preserving learning over distributed and non-stationary data

A comprehensive survey examines Federated Continual Learning (FCL), which combines federated learning's privacy-preserving distributed training with continual learning's ability to adapt to evolving data. The research addresses a critical gap in current FL systems that assume static data, proposing frameworks for real-world applications like healthcare and IoT where data streams continuously shift, causing performance degradation and catastrophic forgetting.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

Not Just After One: Sleep-Inspired Replay Prevents Catastrophic Forgetting After Sequential Tasks

Researchers demonstrate that artificial neural networks can mitigate catastrophic forgetting—the tendency to lose previously learned information when training on new tasks—by applying unsupervised replay mechanisms after sequential learning periods, mimicking biological sleep-based memory consolidation. This approach defers interference correction until after multiple new tasks are learned, suggesting a more efficient pathway for developing continual learning AI systems.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Researchers introduce RAFT, a framework addressing catastrophic forgetting in domain-specific fine-tuning of language models. By combining data refinement with answer-conditioned distillation, RAFT achieves a 23.2% improvement in domain accuracy while recovering 10-18% of the general capability losses typically incurred during fine-tuning.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

Overcoming Forgetting in LLM Fine-Tuning with Evolution Strategies

Researchers demonstrate that Evolution Strategies (ES) can effectively fine-tune large language models without catastrophic forgetting of prior tasks, contrary to recent concerns. By introducing Anchored Weight Decay (AWD), a regularization technique that constrains optimization toward initial parameters, the work shows ES-based continual learning is viable and computationally efficient compared to reinforcement learning approaches.
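The summary describes Anchored Weight Decay only as a regularizer that pulls optimization back toward the initial parameters. A minimal sketch of that idea, assuming a plain L2-style pull toward the starting weights folded into a basic antithetic Evolution Strategies update. The function name `es_step_with_anchor`, the penalty form, and all hyperparameters are illustrative assumptions, not the paper's method:

```python
import random

def es_step_with_anchor(theta, theta0, reward_fn, sigma=0.1, lr=0.05,
                        lam=0.1, pop=20, rng=None):
    """One antithetic ES update plus an assumed anchor term toward theta0."""
    rng = rng or random.Random(0)
    n = len(theta)
    grad = [0.0] * n
    for _ in range(pop):
        # Sample one Gaussian perturbation and evaluate it in both directions.
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        plus = reward_fn([t + sigma * e for t, e in zip(theta, eps)])
        minus = reward_fn([t - sigma * e for t, e in zip(theta, eps)])
        for i in range(n):
            grad[i] += (plus - minus) * eps[i] / (2 * sigma * pop)
    # Ascend the estimated reward gradient, then decay toward the anchor
    # theta0 (ordinary weight decay would decay toward zero instead).
    return [t + lr * g - lr * lam * (t - t0)
            for t, g, t0 in zip(theta, grad, theta0)]


if __name__ == "__main__":
    # Hypothetical toy reward: prefer parameters near 3.0.
    reward = lambda w: -sum((x - 3.0) ** 2 for x in w)
    print(es_step_with_anchor([0.0], [0.0], reward))
```

With `lam=0` the anchor term vanishes and the update reduces to a plain ES step; larger `lam` trades new-task reward for staying close to the pretrained weights.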

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Emergent Slow Thinking in LLMs as Inverse Tree Freezing

Researchers present a statistical-physics framework explaining how large language models develop multi-step reasoning through reinforcement learning with verifiable rewards (RLVR), modeling the process as inverse tree freezing in a concept network. They propose Annealed-RLVR, a timing-optimized training method that outperforms standard RLVR by applying supervised fine-tuning at peak frustration rather than after convergence, preventing policy collapse.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Memory as a Markov Matrix: Sample Efficient Knowledge Expansion via Token-to-Dictionary Mapping

Researchers propose a novel framework that models language model memory as a Markov transition matrix, enabling efficient incorporation of new knowledge without catastrophic forgetting. The approach requires only linear sample complexity in the number of existing tokens and achieves zero forgetting through minimal parameter updates via an embedding-tuning algorithm.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Stabilizing LLM Supervised Fine-Tuning via Explicit Distributional Control

Researchers propose Anchored Learning, a new fine-tuning method that prevents catastrophic forgetting in large language models by controlling distributional drift through a dynamically evolving reference anchor. The technique achieves near-optimal performance gains while reducing degradation from over 53% to under 5% on benchmark tasks.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

Skill Neologisms: Towards Skill-based Continual Learning

Researchers propose skill neologisms—soft tokens added to LLM vocabularies—as a scalable approach to continual learning that enables models to acquire new capabilities without catastrophic forgetting or weight updates. The method demonstrates that independently trained skill tokens can compose zero-shot and work with out-of-distribution tasks, offering a practical alternative to fine-tuning.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Beyond LLMs, Sparse Distributed Memory, and Neuromorphics <A Hyper-Dimensional SRAM-CAM "VaCoAl" for Ultra-High Speed, Ultra-Low Power, and Low Cost>

Researchers propose VaCoAl, a hyperdimensional computing architecture that combines sparse distributed memory with Galois-field algebra to address limitations in modern AI systems like catastrophic forgetting and the binding problem. The deterministic system demonstrates emergent properties equivalent to spike-timing-dependent plasticity and achieves multi-hop reasoning across 25.5M paths in knowledge graphs, positioning it as a complementary third paradigm to large language models.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Persistent Identity in AI Agents: A Multi-Anchor Architecture for Resilient Memory and Continuity

Researchers introduce soul.py, an open-source architecture addressing catastrophic forgetting in AI agents by distributing identity across multiple memory systems rather than centralizing it. The framework implements persistent identity through separable components and a hybrid RAG+RLM retrieval system, drawing inspiration from how human memory survives neurological damage.

AI · Neutral · arXiv – CS AI · Apr 10 · 7/10

Information as Structural Alignment: A Dynamical Theory of Continual Learning

Researchers introduce the Informational Buildup Framework (IBF), a new approach to continual learning that eliminates catastrophic forgetting by treating information as structural alignment rather than stored parameters. The framework demonstrates superior performance across multiple domains including chess and image classification, achieving near-zero forgetting without requiring raw data replay.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing

Researchers introduce SCAN, a new framework for editing Large Language Models that prevents catastrophic forgetting during sequential knowledge updates. The method uses sparse circuit manipulation instead of dense parameter changes, maintaining model performance even after 3,000 sequential edits across major models like Gemma2, Qwen3, and Llama3.1.

🧠 Llama

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning

Researchers discovered that pretrained Vision-Language-Action (VLA) models demonstrate remarkable resistance to catastrophic forgetting in continual learning scenarios, unlike smaller models trained from scratch. Simple Experience Replay techniques achieve near-zero forgetting with minimal replay data, suggesting large-scale pretraining fundamentally changes continual learning dynamics for robotics applications.
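Experience Replay, as referenced in the summary, keeps a small sample of earlier-task data and mixes it into each new-task batch. A minimal sketch, assuming a reservoir-sampled buffer and a fixed replay fraction. `ReplayBuffer`, `mixed_batch`, and the 25% ratio are hypothetical choices for illustration, not details from the paper:

```python
import random

class ReplayBuffer:
    """Fixed-size buffer filled by reservoir sampling, so every example
    seen so far has the same chance of being retained."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the newcomer with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        # Draw without replacement, capped at the current buffer size.
        return self.rng.sample(self.items, min(k, len(self.items)))


def mixed_batch(new_examples, buffer, replay_fraction=0.25):
    """Append replayed old-task examples to a batch of new-task examples."""
    n_replay = int(len(new_examples) * replay_fraction)
    return list(new_examples) + buffer.sample(n_replay)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=5)
    for i in range(100):
        buf.add(("old_task", i))
    batch = mixed_batch([("new_task", i) for i in range(8)], buf)
    print(len(batch))
```

The buffer capacity is the knob that corresponds to "minimal replay data": a smaller buffer stores fewer old examples at the cost of a coarser picture of earlier tasks.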

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

Researchers have identified a critical flaw in reinforcement learning fine-tuning of large language models that causes degradation in multi-attempt performance despite improvements in single attempts. Their proposed solution, Diversity-Preserving Hybrid RL (DPH-RL), uses mass-covering f-divergences to maintain model diversity and prevent catastrophic forgetting while improving training efficiency.
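To see why the direction of the divergence matters, compare a mass-covering divergence with a mode-seeking one on a toy distribution. The sketch below uses forward KL, KL(reference ‖ policy), as a standard example of a mass-covering divergence and reverse KL, KL(policy ‖ reference), as the mode-seeking counterpart. The two-outcome distributions are made up for illustration, and the summary does not say which f-divergences DPH-RL actually uses:

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Reference model spreads mass over two valid answers.
reference = [0.5, 0.5]
# Fine-tuned policy has almost collapsed onto the first answer.
collapsed = [0.99, 0.01]

# Forward KL weights each outcome by the reference, so starving an
# outcome the reference supports is penalized heavily.
forward = kl(reference, collapsed)
# Reverse KL weights by the policy, so the abandoned outcome barely counts.
reverse = kl(collapsed, reference)

if __name__ == "__main__":
    print(forward, reverse)
```

On this toy example the forward penalty (roughly 1.61) is more than twice the reverse penalty (roughly 0.64), which is the sense in which a mass-covering regularizer pushes back harder against diversity collapse.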

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3

cPNN: Continuous Progressive Neural Networks for Evolving Streaming Time Series

Researchers developed cPNN (Continuous Progressive Neural Networks), a new AI architecture that handles evolving data streams with temporal dependencies while avoiding catastrophic forgetting. The system addresses concept drift in time series data by combining recurrent neural networks with progressive learning techniques, showing quick adaptation to new concepts.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3

Dream2Learn: Structured Generative Dreaming for Continual Learning

Researchers introduce Dream2Learn (D2L), a continual learning framework that enables AI models to generate synthetic training data from their own internal representations, mimicking human dreaming for knowledge consolidation. The system creates novel 'dreamed classes' using diffusion models to improve forward knowledge transfer and prevent catastrophic forgetting in neural networks.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Knowledge Fusion of Large Language Models Via Modular SkillPacks

Researchers introduce GraftLLM, a new method for transferring knowledge between large language models using a 'SkillPack' format that preserves capabilities while avoiding catastrophic forgetting. The approach enables efficient model fusion and continual learning for heterogeneous models through modular knowledge storage.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

Attention-Spectrum Regularization for Replay-Free Continual Multimodal LLMs

Researchers propose Attention-Spectrum Regularization (ASR), a new continual learning framework for multimodal large language models that prevents catastrophic forgetting when adapting to new visual domains and tasks without replaying past data. ASR preserves cross-modal attention patterns by storing compact spectral statistics rather than actual training examples, demonstrating improved performance on vision-language benchmarks.