AI · Neutral · arXiv – CS AI · 3h ago · 6/10
Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression
Researchers introduce a semantic distance metric for sparse autoencoder (SAE) features, built on distributional representations and the Wasserstein distance, that enables better cross-layer feature matching and automatic circuit compression in language model interpretability research.
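To make the idea of matching features by Wasserstein distance concrete, here is a minimal sketch. It is not the paper's method. It assumes each feature is summarized by an equal-size 1D sample of activation values, which the summary does not state. The function names (`wasserstein_1d`, `match_features`) and the toy data are hypothetical, and the brute-force matching only suits tiny examples.

```python
from itertools import permutations


def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size 1D empirical samples.

    For equal-size samples, W1 is the mean absolute difference
    between the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def match_features(layer_a, layer_b):
    """Return the one-to-one matching with minimal total W1 cost.

    Brute force over permutations; requires len(layer_a) <= len(layer_b).
    """
    names_a, names_b = list(layer_a), list(layer_b)
    best = min(
        permutations(names_b, len(names_a)),
        key=lambda perm: sum(
            wasserstein_1d(layer_a[a], layer_b[b])
            for a, b in zip(names_a, perm)
        ),
    )
    return dict(zip(names_a, best))


# Made-up activation samples for two features in each of two layers.
layer_a = {"a0": [0.0, 0.1, 0.2], "a1": [1.0, 1.1, 1.2]}
layer_b = {"b0": [1.0, 1.2, 1.1], "b1": [0.1, 0.0, 0.2]}

matching = match_features(layer_a, layer_b)
print(matching)
```

In this toy example each feature in the first layer is paired with the second-layer feature whose activation sample is closest under W1. A real pipeline would likely use richer distributional representations and a scalable assignment solver.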