y0news
AnalyticsDigestsSourcesRSSAICrypto
#llm-quantization1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 7h ago7/10
๐Ÿง 

SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization

SpecQuant introduces a novel quantization framework using spectral decomposition to compress large language models to 4-bit precision for both weights and activations, achieving only 1.5% accuracy loss on LLaMA-3 8B while enabling 2x faster inference and 3x memory reduction. The technique exploits frequency domain properties to preserve essential signal components while suppressing high-frequency noise, addressing a critical challenge in deploying LLMs on edge devices.