AI · Bullish · arXiv – CS AI · 2d ago · 7/10
Explainable LLM Unlearning Through Reasoning
Researchers introduce Targeted Reasoning Unlearning (TRU), a new method for removing specific knowledge from large language models while preserving general capabilities. The approach uses reasoning-based targets to guide the unlearning process, addressing the unintended capability degradation caused by previous gradient-ascent methods.