AINeutralarXiv โ CS AI ยท 5h ago1
๐ง
Joint Training Across Multiple Activation Sparsity Regimes
Researchers propose a novel neural network training strategy that cycles models through multiple activation sparsity regimes using global top-k constraints. Preliminary experiments on CIFAR-10 show this approach outperforms dense baseline training, suggesting joint training across sparse and dense activation patterns may improve generalization.