AI · Bullish · arXiv – CS AI · 18h ago · 6/10
LEAP: Learnable End-to-End Adaptive Pruning of Large Language Models
Researchers introduce LEAP, a technique for pruning large language models that learns a mask over individual weights and achieves better accuracy than existing layer-wise methods, particularly at aggressive sparsity levels. The approach replaces earlier, intractable parameterization methods with a Bernoulli-via-Gumbel-sigmoid relaxation, and reports an average improvement of 2.59 points over ADMM across multiple LLM families.
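To make the "Bernoulli-via-Gumbel-sigmoid" idea concrete, here is a minimal sketch of the generic relaxation in plain Python. It is not LEAP's implementation: the function names (`gumbel_sigmoid_sample`, `soft_mask`, `hard_mask`, `apply_mask`), the temperature `tau`, and the toy logits and weights are all assumptions made for illustration, and no training loop or sparsity objective is shown. Each weight gets a logit. During training, a noisy sigmoid yields a differentiable "soft" keep-probability sample. At inference, the logit is thresholded to a binary keep/drop decision.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gumbel_sigmoid_sample(logit: float, tau: float, rng: random.Random) -> float:
    """Relaxed Bernoulli sample in [0, 1] for a single mask entry.

    The difference of two independent Gumbel(0, 1) draws follows a
    Logistic(0, 1) distribution, so one logistic noise term is enough.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return sigmoid((logit + logistic_noise) / tau)


def soft_mask(logits, tau, rng):
    """Sample a relaxed mask with one entry per weight."""
    return [[gumbel_sigmoid_sample(l, tau, rng) for l in row] for row in logits]


def hard_mask(logits):
    """Deterministic binary mask: keep a weight when its logit is positive."""
    return [[1 if l > 0 else 0 for l in row] for row in logits]


def apply_mask(weights, mask):
    """Element-wise product of a weight matrix and a mask."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Toy example (hypothetical values, not taken from the paper).
rng = random.Random(0)
weights = [[0.5, -1.2], [0.8, 2.0]]
mask_logits = [[2.0, -3.0], [-1.0, 4.0]]

relaxed = soft_mask(mask_logits, tau=0.5, rng=rng)  # used while training
binary = hard_mask(mask_logits)                     # used after training
pruned = apply_mask(weights, binary)
```

In this sketch, lowering `tau` pushes the relaxed samples toward 0 or 1, so the soft mask behaves more like a true Bernoulli draw, at the cost of noisier gradients. In a real setup the logits would be trainable parameters updated by an autodiff framework together with a sparsity constraint.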