arXiv – CS AI · 9h ago
🧠
Detecting Distillation Data from Reasoning Models
Researchers have developed Token Probability Deviation (TPD), a method for detecting whether specific questions appeared in a reasoning model's distillation training data. The technique targets data-contamination risk in reasoning distillation, where benchmark questions leaking into distillation sets can inflate reported performance metrics; the authors report up to a 31% improvement in detection accuracy.
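The summary does not spell out how TPD computes its score, but the idea of a probability-deviation detector can be sketched as follows. This is a minimal illustration, not the paper's method: the function name `tpd_score`, the reference probability `p0`, and the scoring rule (mean shortfall of low-confidence token probabilities below `p0`) are all assumptions for exposition. The intuition encoded here is that a model tends to reproduce answers to questions it was distilled on with uniformly high token confidence, so such questions yield a small deviation score.

```python
def tpd_score(token_probs, p0=0.5):
    """Illustrative Token-Probability-Deviation-style score (hypothetical).

    token_probs: per-token probabilities the model assigned to its own
    generated answer for a candidate question (e.g. collected from a
    decoder's per-step output distribution).
    p0: hypothetical reference probability; tokens below it count as
    "low-confidence" and contribute their shortfall to the score.

    Returns the mean shortfall (p0 - p) over low-confidence tokens.
    A small score suggests uniformly confident generation, which a
    TPD-style detector would associate with seen (distilled) questions.
    """
    deviations = [p0 - p for p in token_probs if p < p0]
    if not deviations:
        return 0.0  # every token was generated with high confidence
    return sum(deviations) / len(deviations)


# Toy comparison: a "seen" question answered with uniformly high
# confidence vs. an "unseen" one with several uncertain tokens.
seen_score = tpd_score([0.99, 0.97, 0.95, 0.98])
unseen_score = tpd_score([0.90, 0.40, 0.30, 0.85])
```

Under this sketch, `seen_score` is 0 (no token falls below `p0`) while `unseen_score` is positive, so a detector could flag questions whose score falls under a calibrated threshold as likely distillation data. The real TPD method may differ substantially in how it aggregates probabilities and sets thresholds.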