Certified Circuits: Stability Guarantees for Mechanistic Circuits
Researchers introduce Certified Circuits, a framework that provides provable stability guarantees for neural network circuit discovery. The method wraps existing algorithms with randomized data subsampling to ensure circuit components remain consistent across dataset variations, achieving 91% higher accuracy while using 45% fewer neurons.