AI · Neutral · arXiv – CS AI · 18h ago · 6/10
Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes
Researchers propose a methodology for validating attention-head circuits in large language models that combines co-activation clustering with causal ablation testing. They find that clustering signals can only propose candidate circuits: validating a circuit requires closure tests, which measure functional impact through ablation. The authors present this distinction as a challenge to current interpretability approaches.
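
To make the propose-then-validate idea concrete, here is a minimal toy sketch in Python. It is not the paper's procedure: the summary does not describe the actual clustering method or closure criterion, so the correlation threshold, the zero-ablation stand-in, and every name below (`propose_clusters`, `ablation_effect`, `toy_model`) are hypothetical. Heads that co-activate are grouped into proposals, and a proposal is kept only if ablating it changes the output of a toy model.

```python
import numpy as np

def propose_clusters(acts, threshold=0.8):
    """Propose candidate circuits: groups of heads whose activations correlate.

    acts has shape (n_samples, n_heads). Heads are linked when their Pearson
    correlation is >= threshold; each connected group of 2+ heads is a proposal.
    (Assumed stand-in for whatever clustering the paper uses.)
    """
    corr = np.corrcoef(acts.T)
    unassigned = set(range(corr.shape[0]))
    clusters = []
    while unassigned:
        stack, group = [min(unassigned)], set()
        while stack:
            h = stack.pop()
            if h in group:
                continue
            group.add(h)
            stack.extend(j for j in unassigned
                         if j not in group and corr[h, j] >= threshold)
        unassigned -= group
        if len(group) > 1:
            clusters.append(sorted(group))
    return clusters

def ablation_effect(model_fn, acts, heads):
    """Mean absolute change in model output when `heads` are zero-ablated."""
    ablated = acts.copy()
    ablated[:, heads] = 0.0
    return float(np.mean(np.abs(model_fn(acts) - model_fn(ablated))))

# Synthetic activations: heads 0-1 and heads 2-3 each co-activate,
# heads 4-5 are independent noise.
rng = np.random.default_rng(0)
n = 500
a, b = rng.normal(size=n), rng.normal(size=n)
acts = np.column_stack([
    a, a + 0.1 * rng.normal(size=n),
    b, b + 0.1 * rng.normal(size=n),
    rng.normal(size=n), rng.normal(size=n),
])

def toy_model(x):
    # Hypothetical downstream output that only reads heads 0 and 1.
    return x[:, 0] + x[:, 1]

proposals = propose_clusters(acts)
validated = [c for c in proposals
             if ablation_effect(toy_model, acts, c) > 0.1]
```

In this toy setup both correlated pairs are proposed, but only the pair the model actually reads survives the ablation check. That mirrors the summary's point that co-activation alone does not establish a functional circuit.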