y0news

#llm-unlearning News & Analysis

3 articles tagged with #llm-unlearning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 12 · 7/10

Unlearners Can Lie: Evaluating and Improving Honesty in LLM Unlearning

Researchers identify critical honesty failures in Large Language Model unlearning methods, where models hallucinate or behave inconsistently after attempting to forget harmful training data. They propose ReVa, a representation-alignment procedure that significantly improves model honesty by better acknowledging forgotten knowledge while maintaining utility on retained information.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

De-attribute to Forget for LLM Unlearning

Researchers propose DareU, a novel LLM unlearning framework that uses data attribution rewards and reinforcement learning to remove training data influence from large language models. Unlike existing approaches that maximize loss on forget sets, this method reduces attribution scores to forgotten data owners, addressing critical issues of over-forgetting and model utility degradation.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation

Researchers propose a multi-objective unlearning framework for Large Language Models that simultaneously removes hazardous information, preserves general utility, avoids over-refusal, and resists adversarial attacks. The method uses unified domain representation and bidirectional logit distillation to harmonize competing optimization goals, achieving state-of-the-art performance across diverse unlearning requirements.