y0news
🧠 AI · Bullish · arXiv – CS AI · 5h ago · 7/10

SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression

Researchers propose SoLA, a training-free compression method for large language models that combines soft activation sparsity with low-rank decomposition. On LLaMA-2-70B, the method achieves 30% compression while reducing perplexity from 6.95 to 4.44 and improving downstream task accuracy by 10%.
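The summary does not detail SoLA's procedure, but the low-rank decomposition half of the approach can be sketched with a standard truncated SVD: a weight matrix is replaced by the product of two thin factors, trading a small approximation error for fewer parameters. The function and rank below are illustrative assumptions, not SoLA's actual algorithm.

```python
import numpy as np

def low_rank_compress(W, rank):
    """Approximate W (m x n) as A @ B with A (m x r), B (r x n) via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B                  # W ≈ A @ B

# Illustrative example: a 256x256 weight matrix factored at rank 64
# stores 2 * 256 * 64 parameters instead of 256 * 256 (a 2x reduction).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
A, B = low_rank_compress(W, rank=64)
print(A.shape, B.shape)  # (256, 64) (64, 256)
```

The rank controls the compression/accuracy trade-off: a matrix whose information is concentrated in a few singular directions can be reconstructed almost exactly at low rank, which is why methods in this family pair decomposition with criteria (such as activation statistics) for deciding how aggressively each layer can be truncated.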

๐Ÿข Perplexity