AINeutralarXiv – CS AI · 9h ago6/10
🧠
Learning What to Forget: Improving LLM Unlearning via Learned Token-Level Importance
Researchers introduce Alternating Token-Weighted Unlearning (ATWU), a new method for removing specific knowledge from language models while maintaining their general capabilities. The approach identifies which tokens are most relevant for forgetting by measuring conflict with model retention objectives, achieving state-of-the-art results without requiring external supervision or auxiliary models.