AI · Neutral · arXiv – CS AI · Mar 17 · 5/10
Align Forward, Adapt Backward: Closing the Discretization Gap in Logic Gate Networks
Researchers propose CAGE (Confidence-Adaptive Gradient Estimation) to address the training-inference mismatch in logic gate networks that use soft mixtures during training but hard selection during inference. The method achieves over 98% accuracy on MNIST with zero selection gap, significantly outperforming existing approaches such as Gumbel-ST, which suffers accuracy collapse.
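
To make the mismatch concrete, here is a minimal sketch of a single node that mixes gate outputs with softmax weights during training but commits to its highest-scoring gate at inference. This is not the CAGE method itself. The four-gate set, the function names, and the definition of the gap as the largest output discrepancy are all assumptions made for illustration (the summary's "selection gap" most likely refers to an accuracy difference).

```python
import math

# Real-valued relaxations of four two-input gates (an illustrative subset, assumed).
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
    "NAND": lambda a, b: 1 - a * b,
}
GATE_FNS = list(GATES.values())


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_output(logits, a, b):
    """Training-time output: softmax-weighted mixture of all gate outputs."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATE_FNS))


def hard_output(logits, a, b):
    """Inference-time output: only the gate with the largest logit is used."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return GATE_FNS[best](a, b)


def output_gap(logits, inputs):
    """Largest |soft - hard| discrepancy over the given inputs (assumed metric)."""
    return max(
        abs(soft_output(logits, a, b) - hard_output(logits, a, b))
        for a, b in inputs
    )


INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
uncertain_logits = [0.2, 0.1, 0.0, -0.1]   # nearly uniform mixture
confident_logits = [10.0, 0.0, 0.0, 0.0]   # mixture concentrated on AND

print("uncertain gap:", output_gap(uncertain_logits, INPUTS))
print("confident gap:", output_gap(confident_logits, INPUTS))
```

With nearly uniform logits the mixture and the selected gate disagree substantially, while a confident node behaves almost identically in both modes. This is the kind of discrepancy that a training-inference mismatch refers to.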