AI · Neutral · arXiv – CS AI · Apr 20 · 6/10
Protecting Language Models Against Unauthorized Distillation through Trace Rewriting
Researchers propose trace rewriting techniques to protect language models against unauthorized knowledge distillation, in which a smaller student model is trained on a larger teacher model's outputs without permission. The methods preserve the teacher model's accuracy while making its rewritten traces less useful for distillation and embedding detectable watermarks in student models trained on them.