🧠 AI · Neutral · Importance: 6/10

Fitting Multilinear Polynomials for Logic Gate Networks

arXiv – CS AI | Youngsung Kim
🤖 AI Summary

Researchers propose a novel approach to training learnable logic gate networks by representing 2-input Boolean gates as multilinear polynomials in a 4-dimensional coefficient space, reducing a vector-quantization problem from 16 to 4 parameters per neuron. The proposed CovJac method outperforms the baseline Soft-Mix approach, particularly in deeper networks, by addressing the gradient starvation that causes performance collapse as depth grows.

Analysis

This research addresses a fundamental challenge in neural architecture design: training combinational circuits composed of stacked Boolean logic gates. The core insight is elegantly simple. Since every 2-input Boolean gate corresponds to a unique multilinear polynomial f(x, y) = a + bx + cy + dxy with exactly 4 coefficients, the problem becomes one of vector quantization in a constrained 4-dimensional space rather than selection among 16 discrete gate types.
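To make the correspondence concrete, here is a minimal sketch (ordinary Python/NumPy, not the paper's code; the function and variable names are my own) that derives the four coefficients for each of the 16 truth tables and stacks them into a codebook:

```python
import itertools
import numpy as np

# A 2-input Boolean gate is a truth table f over {0,1}^2. Its unique
# multilinear form f(x, y) = a + b*x + c*y + d*x*y follows from evaluating
# at the four corners:
#   a = f(0,0)
#   b = f(1,0) - f(0,0)
#   c = f(0,1) - f(0,0)
#   d = f(1,1) - f(1,0) - f(0,1) + f(0,0)
def multilinear_coeffs(f):
    """f maps (x, y) in {0,1}^2 to {0,1}; returns [a, b, c, d]."""
    a = f[(0, 0)]
    b = f[(1, 0)] - a
    c = f[(0, 1)] - a
    d = f[(1, 1)] - f[(1, 0)] - f[(0, 1)] + a
    return [a, b, c, d]

# Enumerate all 16 gates, verify the polynomial reproduces each truth table,
# and stack the coefficient vectors into a 16 x 4 codebook.
codebook = []
for bits in itertools.product([0, 1], repeat=4):
    f = dict(zip(itertools.product([0, 1], repeat=2), bits))
    a, b, c, d = multilinear_coeffs(f)
    assert all(a + b*x + c*y + d*x*y == f[(x, y)] for (x, y) in f)
    codebook.append([a, b, c, d])

codebook = np.array(codebook, dtype=float)  # shape (16, 4)
print(np.linalg.matrix_rank(codebook))      # 4: the 16 gates span only 4 dims
```

The final rank check confirms that the 16 coefficient vectors span only a 4-dimensional space, which is exactly the rank-4 structure the gradient-flow argument below turns on.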

The technical contribution centers on a gradient flow problem that emerges during training. The baseline Soft-Mix method applies a 16-dimensional softmax over gate identities, but the underlying codebook has inherent rank-4 structure. This geometric mismatch causes gradient starvation: 11 of the 15 free simplex directions fall in the Jacobian's nullspace and receive no gradient, and crucially, the backward signal vanishes entirely at uniform initialization. The authors prove that no standard affine reparameterization can fix this under straight-through estimation (STE), establishing a fundamental limitation of naive approaches.
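The nullspace count can be checked numerically. Continuing from the codebook sketch above (again an illustration, not the paper's code): a Soft-Mix neuron outputs the mixture m = p @ codebook with softmax weights p, so its Jacobian with respect to the logits is codebook^T @ (diag(p) - p p^T), whose rank is at most 4:

```python
# Soft-Mix mixture: m = p @ codebook, with p a softmax over 16 logits.
logits = np.zeros(16)                      # uniform initialization
p = np.exp(logits) / np.exp(logits).sum()  # softmax -> all weights 1/16

softmax_jac = np.diag(p) - np.outer(p, p)  # 16 x 16, rank 15 on the simplex
mix_jac = codebook.T @ softmax_jac         # 4 x 16 Jacobian of the mixture

print(np.linalg.matrix_rank(mix_jac))       # 4: only 4 directions carry signal
print(15 - np.linalg.matrix_rank(mix_jac))  # 11 starved simplex directions
```

The separate claim that the backward signal vanishes entirely at uniform initialization is specific to the STE setting analyzed in the paper and is not reproduced in this sketch.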

The proposed CovJac method leverages the covariance Jacobian of the soft vector-quantization selection to couple the starved interaction coefficients with the always-active constant channel, restoring gradient flow. This yields a 4x parameter reduction per neuron while maintaining or improving performance. Empirical validation across seven datasets shows consistent advantages, with CovJac proving particularly valuable in deep networks: Soft-Mix suffers a 37.3 percentage point drop on CIFAR-10 at 12 layers, while CovJac remains stable with only a 0.5 percentage point degradation.
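The summary does not spell out CovJac's exact construction, so the following is only a speculative sketch of the general idea: a hard, STE-style forward selection whose backward pass is routed through the covariance Jacobian of the soft selection, (diag(p) - p p^T) @ codebook, so that all coefficient channels, including the constant one, stay coupled to the logits. The class and variable names are invented; treat this as a guess at the mechanism, not the authors' implementation.

```python
import torch

class CovJacSelect(torch.autograd.Function):
    """Hard gate selection with a covariance-shaped backward (illustrative).

    Forward picks the argmax codeword, as in STE-style quantization.
    Backward routes the upstream gradient through the Jacobian of the
    soft mixture m = p @ codebook, which is (diag(p) - p p^T) @ codebook.
    The codebook is treated as fixed (no gradient is returned for it).
    """

    @staticmethod
    def forward(ctx, logits, codebook):
        p = torch.softmax(logits, dim=-1)
        ctx.save_for_backward(p, codebook)
        return codebook[logits.argmax(dim=-1)]   # selected 4-coefficient gate

    @staticmethod
    def backward(ctx, grad_out):
        p, codebook = ctx.saved_tensors
        # Covariance Jacobian of the soft selection, shape (16, 4):
        J = (torch.diag(p) - torch.outer(p, p)) @ codebook
        return J @ grad_out, None                # chain rule through the mixture

# Usage: gradient flows to the logits even from a uniform start.
codebook_t = torch.tensor(codebook, dtype=torch.float32)  # from the sketch above
logits = torch.zeros(16, requires_grad=True)
gate = CovJacSelect.apply(logits, codebook_t)
gate.sum().backward()
print(logits.grad.norm())  # nonzero at uniform initialization
```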

The work demonstrates how understanding the geometric structure of parameterization spaces can yield both theoretical insight and practical improvement. Deep learning practitioners working with constrained architectures should note that naive softmax selection over prototypes can fail catastrophically with depth unless the underlying parameter manifold structure is explicitly addressed.

Key Takeaways
  • Boolean gate networks reduce to 4-dimensional multilinear polynomial fitting, enabling 4x parameter compression versus 16-dimensional soft selection
  • Soft-Mix baseline suffers gradient starvation at initialization due to rank-4 codebook structure in 16-dimensional softmax space
  • CovJac method bypasses starvation by coupling interaction coefficients to the always-active constant channel via the covariance Jacobian
  • CovJac maintains stability at depth (0.5pp drop at 12 layers) while Soft-Mix collapses (-37.3pp), indicating better scaling properties
  • Geometric parameterization structure matters fundamentally: no affine reparameterization can fix STE-based starvation in this setting