AI · Neutral · arXiv – CS AI · 9h ago · 6/10
PLOT: Progressive Localization via Optimal Transport in Neural Causal Abstraction
Researchers introduce PLOT (Progressive Localization via Optimal Transport), a new framework for mechanistic interpretability that efficiently identifies causal variables in neural networks through optimal transport coupling rather than computationally expensive searches. The method significantly speeds up causal abstraction analysis while maintaining competitive accuracy, offering practical advantages for large-scale AI interpretability research.
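For readers unfamiliar with the term, an optimal transport coupling is a matrix that spreads mass from one set of items onto another at minimal total cost. The sketch below is a generic entropy-regularized (Sinkhorn) solver on a toy problem, not the PLOT algorithm itself: the summary does not describe how PLOT builds its costs or uses the coupling. The cost values, the "2 high-level variables vs. 3 candidate neural sites" framing, and the `sinkhorn` helper are all hypothetical and chosen only for illustration.

```python
import math

def sinkhorn(cost, a, b, eps=0.5, iters=200):
    """Entropy-regularized OT coupling between histograms a and b.

    Generic Sinkhorn iteration; an illustrative stand-in, not PLOT's method.
    """
    n, m = len(a), len(b)
    # Gibbs kernel: lower cost gives a larger kernel entry.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical toy data: rows are 2 high-level causal variables,
# columns are 3 candidate neural sites; entries are made-up mismatch costs.
cost = [[0.1, 0.9, 0.8],
        [0.9, 0.2, 0.7]]
a = [0.5, 0.5]            # uniform mass over variables
b = [1 / 3, 1 / 3, 1 / 3] # uniform mass over sites

plan = sinkhorn(cost, a, b)

# For each variable, report the site that receives the most mass.
for i, row in enumerate(plan):
    best = max(range(len(row)), key=lambda j: row[j])
    print(f"variable {i} -> site {best}")
```

The coupling is computed in one pass of matrix scaling, which is why such an approach can be cheaper than searching over candidate sites one at a time. How PLOT turns a coupling into a progressive localization is left to the paper.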