AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Distribution Preference Optimization: A Fine-grained Perspective for LLM Unlearning
Researchers introduce DiPO (Distribution Preference Optimization), an algorithm for LLM unlearning that operates on token distributions rather than on full responses. The method addresses limitations of existing approaches such as NPO by constructing preference signals through selective amplification of model logits, and it reportedly outperforms those approaches on unlearning benchmarks while maintaining model utility.