AINeutralarXiv – CS AI · 14h ago6/10
🧠
Theoretical Analysis of Sparse Optimization with Reparameterization, Weight Decay, and Adaptive Learning Rate
Researchers introduce ReWA, a novel sparse optimization method combining reparameterization, weight decay, and adaptive learning rates to address instability issues in ℓp regularization. Experiments on CIFAR-10 and ImageNet demonstrate that ReWA achieves superior sparsity compared to ℓ1 regularization while maintaining test accuracy, offering a practical alternative for neural network compression.