y0news
#sparse-computing (1 article)
🧠 AI · Bullish · arXiv – CS AI · 4h ago · 7/10

SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models

Researchers propose SLaB, a framework for compressing large language models by decomposing each weight matrix into three components: a sparse matrix, a low-rank matrix, and a binary matrix. The method outperforms existing compression techniques, reducing perplexity by up to 36% at a 50% compression rate, and requires no model retraining.
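The summary only names the three components, not the paper's actual fitting procedure, so the following is a hypothetical NumPy sketch of one greedy way such a sparse + low-rank + binary decomposition could be formed: keep the largest-magnitude entries as the sparse part, take a truncated SVD of the residual as the low-rank part, and approximate what remains with a scaled sign matrix. The sparsity level (5%), rank (8), and fitting order are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))  # stand-in for a weight matrix

# 1) Sparse part S: keep the top 5% of entries by magnitude.
k = int(0.05 * W.size)
thresh = np.partition(np.abs(W).ravel(), -k)[-k]
S = np.where(np.abs(W) >= thresh, W, 0.0)
R = W - S

# 2) Low-rank part L: rank-8 truncated SVD of the residual.
r = 8
U, sv, Vt = np.linalg.svd(R, full_matrices=False)
L = (U[:, :r] * sv[:r]) @ Vt[:r]
R2 = R - L

# 3) Binary part: sign matrix with the least-squares scalar scale,
#    i.e. alpha = mean(|R2|) minimizes ||R2 - alpha * sign(R2)||_F.
B = np.sign(R2)
alpha = np.abs(R2).mean()

W_hat = S + L + alpha * B
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

Each stage absorbs structure the previous one cannot represent cheaply, which is why the combined reconstruction error is well below that of any single component used alone.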

๐Ÿข Perplexity๐Ÿง  Llama