y0news
AnalyticsDigestsSourcesRSSAICrypto
#memory-reduction1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 5d ago7/104
๐Ÿง 

HEAPr: Hessian-based Efficient Atomic Expert Pruning in Output Space

Researchers introduce HEAPr, a novel pruning algorithm for Mixture-of-Experts (MoE) language models that decomposes experts into atomic components for more precise pruning. The method achieves nearly lossless compression at 20-25% pruning ratios while reducing computational costs by approximately 20%.