#knowledge-editing News & Analysis

13 articles tagged with #knowledge-editing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 23 · 7/10

Exposing the Illusion of Erasure in Knowledge Editing for LLMs

A new research paper reveals critical vulnerabilities in Knowledge Editing (KE) techniques used to update facts in Large Language Models without retraining. The study demonstrates that edited knowledge is not truly erased but merely suppressed, and can be recovered through adversarial prompting, exposing fundamental flaws in current post-hoc update methods.

AI · Bearish · arXiv – CS AI · May 12 · 7/10

Benchmarking Safety Risks of Knowledge-Intensive Reasoning under Malicious Knowledge Editing

Researchers introduce EditRisk-Bench, a new benchmark for evaluating safety vulnerabilities in large language models when their knowledge is maliciously edited. The study demonstrates that adversaries can inject false or harmful information that corrupts downstream reasoning while remaining difficult to detect, revealing critical security gaps in knowledge-intensive AI systems.

AI · Neutral · arXiv – CS AI · May 7 · 7/10

Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models

Researchers present an automated pipeline for auditing behavioral changes in large language models when interventions are applied. The method generates human-readable hypotheses about model differences and validates them statistically, successfully identifying both intended and unexpected side-effects across real-world interventions like knowledge editing and unlearning.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing

Researchers introduce SCAN, a new framework for editing Large Language Models that prevents catastrophic forgetting during sequential knowledge updates. The method uses sparse circuit manipulation instead of dense parameter changes, maintaining model performance even after 3,000 sequential edits across major models like Gemma2, Qwen3, and Llama3.1.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Orthogonal Representation Editing: Decoupling Semantic Entanglement in Batch Knowledge Editing of LLMs

Researchers propose Orthogonal Representation Editing (ORE), a method for efficiently updating factual knowledge in Large Language Models without full retraining. The technique addresses a critical limitation in batch knowledge editing by using orthogonal constraints to decouple entangled semantic representations, and is reported to outperform existing approaches, including in cross-lingual settings.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

LOKI: Memory-Free Null-Space Constrained Lifelong Knowledge Editing

LOKI is a new method for lifelong knowledge editing in language models that dynamically selects which layers to update and avoids catastrophic forgetting without requiring access to previous training data. The approach achieves up to 14% improvement in accuracy over existing methods by using the Hilbert-Schmidt Independence Criterion and null-space projection techniques.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Benchmarking Knowledge Editing using Logical Rules

Researchers introduce a new benchmark for evaluating knowledge editing in Large Language Models that tests the logical consequences of edits, not just direct fact insertion. Current methods such as ROME and fine-tuning (FT) show performance gaps of up to 24% between edited facts and their logical implications, revealing a critical weakness in how LLMs handle knowledge consistency.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

Researchers introduce ZeroUnlearn, a novel machine unlearning framework that efficiently removes sensitive information from large language models through knowledge re-mapping and representational orthogonality, rather than expensive retraining. The method preserves overall model utility while selectively unlearning harmful data in few-shot settings, addressing critical privacy and safety concerns in LLMs.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

Researchers propose Joint Neighborhood Optimization (JNO), a new framework for knowledge editing in large language models that simultaneously manages desired information propagation and prevents unintended disruption to related facts. The method uses Pressure-Aware Coordination to jointly optimize coupled constraints and achieves a 7% improvement in both propagation and preservation metrics across different model architectures.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models

Researchers propose LDKE, a new framework for editing knowledge in Multimodal Large Language Models that addresses two critical failure modes: causal misalignment (edits confined to specific samples) and feature entanglement (unintended alterations to related information). The method uses localized layer identification and input disentanglement to enable precise, generalized edits while preserving unrelated knowledge.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

From Fact Overwriting to Knowledge Evolution: Causal Editing via On-Policy Self-Distillation

Researchers present CODE, a novel approach to knowledge editing in large language models that replaces fact overwriting with causal reasoning. By embedding causal narratives and on-policy distillation into model parameters, CODE reduces self-refutation rates from 95.6% to 1.8%, enabling LLMs to evolve knowledge coherently rather than storing isolated facts.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

MetaKE: Meta-learning Aligned Knowledge Editing via Bi-level Optimization

Researchers propose MetaKE, a new framework for knowledge editing in Large Language Models that addresses the 'Semantic-Execution Disconnect' through bi-level optimization. The method treats edit targets as learnable parameters and uses a Structural Gradient Proxy to align edits with the model's feasible manifold, showing significant improvements over existing approaches.

AI · Neutral · arXiv – CS AI · Mar 17 · 5/10

SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models

Researchers introduce SAKE, the first benchmark for editing auditory attribute knowledge in large audio-language models without requiring full retraining. The study reveals significant limitations in current editing methods, particularly with auditory generalization and sequential editing, while finding that fine-tuning modality connectors offers better performance than editing LLM backbones directly.