arXiv — CS AI · 7h ago
Protecting Language Models Against Unauthorized Distillation through Trace Rewriting
Researchers propose trace-rewriting techniques that protect language models from unauthorized knowledge distillation, in which smaller "student" models are trained on a larger model's outputs without permission. The rewrites preserve the teacher model's accuracy while degrading the traces' usefulness for distillation and embedding detectable watermarks in any resulting student models.
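The paper's actual rewriting scheme is not detailed in this summary. As a purely hypothetical sketch of the general idea, one could imagine a keyed, deterministic perturbation of intermediate reasoning steps that leaves the final answer untouched, so that the same key can later verify whether published traces (or a model trained on them) carry the watermark. All function names and the toy "filler-word" transformation below are illustrative assumptions, not the authors' method:

```python
import hashlib
import random

# Toy set of semantically interchangeable connectives; the keyed choice
# among them acts as the watermark signal (illustrative assumption).
FILLERS = ["Thus,", "Hence,", "Therefore,", "So,"]

def _keyed_rng(secret_key: str) -> random.Random:
    # Derive a deterministic RNG from the secret key.
    seed = int(hashlib.sha256(secret_key.encode()).hexdigest(), 16)
    return random.Random(seed)

def rewrite_trace(trace_steps, answer, secret_key):
    """Perturb intermediate steps with a keyed transformation;
    the final answer passes through unchanged, preserving accuracy."""
    rng = _keyed_rng(secret_key)
    rewritten = [f"{rng.choice(FILLERS)} {step}" for step in trace_steps]
    return rewritten, answer

def detect_watermark(rewritten_steps, secret_key):
    """Re-derive the keyed filler sequence and check for a match."""
    rng = _keyed_rng(secret_key)
    expected = [rng.choice(FILLERS) for _ in rewritten_steps]
    return all(s.startswith(e) for s, e in zip(rewritten_steps, expected))
```

In this sketch, only the surface form of the trace is keyed, so a student model that memorizes the rewritten traces would reproduce the keyed pattern, making the distillation detectable with the secret key.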