AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Logit Distance Bounds Representational Similarity
Researchers show that logit distance, a measure based on differences in model predictions, bounds representational similarity in neural networks better than KL divergence does. KL-based distillation can preserve predictive accuracy while failing to maintain the linear structure of internal representations, which has implications for transfer learning and model compression.
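To make the contrast concrete, here is a minimal sketch comparing the two quantities on a toy teacher/student pair. The summary does not give the paper's exact definition of logit distance, so the `logit_distance` function below is an assumption: the L2 norm between mean-centered logits (centering removes the constant shift that softmax ignores). The example logits are invented for illustration only.

```python
import numpy as np


def log_softmax(z):
    # Numerically stable log-softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def kl_divergence(teacher_logits, student_logits):
    # Mean KL(teacher || student) over the batch, as used in standard distillation.
    log_p = log_softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    return float((np.exp(log_p) * (log_p - log_q)).sum(axis=-1).mean())


def logit_distance(teacher_logits, student_logits):
    # ASSUMPTION: L2 distance between mean-centered logits. The paper's exact
    # definition may differ; this is only an illustrative stand-in.
    t = teacher_logits - teacher_logits.mean(axis=-1, keepdims=True)
    s = student_logits - student_logits.mean(axis=-1, keepdims=True)
    return float(np.linalg.norm(t - s, axis=-1).mean())


# Hypothetical logits: the teacher is very confident in class 0, and the
# student agrees on class 0 but scrambles the low-probability classes.
teacher = np.array([[10.0, 0.0, 0.0, 0.0]])
student = np.array([[10.0, 3.0, -3.0, 0.0]])

kl = kl_divergence(teacher, student)
dist = logit_distance(teacher, student)
print(f"KL divergence:  {kl:.6f}")
print(f"Logit distance: {dist:.4f}")
```

In this toy case both models predict the same class and the KL divergence stays below 0.01, yet the centered logits differ by about 4.24. That is the kind of gap the summary describes: a KL objective places almost no weight on classes the teacher considers unlikely, so the student's logits there are left largely unconstrained.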