AIBullisharXiv · CS AI · 5h ago
Dissecting Quantization Error: A Concentration-Alignment Perspective
Researchers introduce Concentration-Alignment Transforms (CAT), a new method that reduces quantization error in large language and vision models by improving both the concentration and the alignment of weights and activations. At 4-bit precision, the technique consistently matches or outperforms existing quantization methods across several LLMs.
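The post doesn't spell out how CAT is actually constructed, so the snippet below is only a minimal sketch of the broader idea it builds on: applying an invertible (here, orthogonal) transform before low-bit quantization so that outlier channels are spread out and the 4-bit grid is used more evenly. The random rotation, the outlier setup, and all names are illustrative assumptions, not the paper's method.

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric per-row (per-output-channel) 4-bit quantize + dequantize."""
    scale = np.max(np.abs(w), axis=1, keepdims=True) / 7.0   # int4 grid: -8..7
    scale = np.where(scale == 0, 1.0, scale)                  # avoid divide-by-zero
    return np.clip(np.round(w / scale), -8, 7) * scale

def quant_mse(w, transform=None):
    """MSE of 4-bit quantization, optionally applied inside an orthogonal transform."""
    if transform is None:
        return float(np.mean((w - quantize_4bit(w)) ** 2))
    # Rotate the input channels, quantize, rotate back (transform is orthogonal).
    w_hat = quantize_4bit(w @ transform) @ transform.T
    return float(np.mean((w - w_hat) ** 2))

rng = np.random.default_rng(0)
d = 256
w = rng.normal(size=(d, d))
w[:, :4] *= 20.0                      # a few outlier input channels inflate the scale

# Random orthogonal matrix as a stand-in "concentration" transform.
t, _ = np.linalg.qr(rng.normal(size=(d, d)))

print(f"plain 4-bit MSE:       {quant_mse(w):.4f}")
print(f"transformed 4-bit MSE: {quant_mse(w, transform=t):.4f}")
```

Once the outlier channels are rotated into a denser distribution, the per-row quantization step shrinks and the reconstruction error drops, which is the kind of concentration effect the summary alludes to.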