AI · Bullish · arXiv – CS AI · 7h ago · 7/10
LC-QAT: Data-Efficient 2-Bit QAT for LLMs via Linear-Constrained Vector Quantization
Researchers introduce LC-QAT, a 2-bit quantization-aware training (QAT) method for large language models that combines vector quantization with learnable affine mappings. The approach reportedly outperforms existing QAT methods while requiring only 0.1-10% of the training data those methods typically use, which would make extremely low-bit LLMs more practical to deploy.
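To make "vector quantization with an affine mapping" concrete, here is a minimal NumPy sketch of one possible reading of the idea. It is not the paper's algorithm, since the summary does not describe the actual construction or training procedure. The helper names (`base_codes`, `affine_codebook`, `quantize`), the group size of 4, and the choice of `A` and `b` are all assumptions made for illustration. Each group of 4 weights is stored as one index into a 256-entry codebook (8 bits per 4 weights, so 2 bits per weight), and every codeword is an affine image `A @ z + b` of a fixed integer code `z`. In a QAT setting `A` and `b` would be learned; here they are simply initialized so that the codebook reduces to a uniform 2-bit grid.

```python
import itertools
import numpy as np

def base_codes(dim=4, bits=2):
    """All integer code vectors in {0..2^bits-1}^dim, shape (2^(bits*dim), dim)."""
    levels = range(2 ** bits)
    return np.array(list(itertools.product(levels, repeat=dim)), dtype=np.float64)

def affine_codebook(codes, A, b):
    """Map each fixed integer code z to a codeword A @ z + b (hypothetical helper)."""
    return codes @ A.T + b

def quantize(vectors, codebook):
    """Assign each vector to its nearest codeword by squared L2 distance."""
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
dim, bits = 4, 2
W = rng.normal(size=(64, 8))        # toy weight matrix, not real LLM weights
groups = W.reshape(-1, dim)         # 128 weight vectors of length 4

codes = base_codes(dim, bits)       # 256 fixed integer codes

# Assumed initialization: diagonal A and constant b spanning the weight range.
# A training loop would update A and b; that part is omitted here.
lo, hi = groups.min(), groups.max()
step = (hi - lo) / (2 ** bits - 1)
A = np.eye(dim) * step
b = np.full(dim, lo)
codebook = affine_codebook(codes, A, b)

idx, W_hat = quantize(groups, codebook)
bits_per_weight = np.log2(len(codebook)) / dim
mse = float(((groups - W_hat) ** 2).mean())
print(f"bits per weight: {bits_per_weight}, reconstruction MSE: {mse:.4f}")
```

With a diagonal `A` this is equivalent to ordinary per-coordinate 2-bit rounding. A full (non-diagonal) `A` would let the 256 codewords form a skewed lattice that can follow correlations between neighboring weights, while still only requiring `A` and `b` to be stored alongside the indices.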