AI · Bullish · arXiv – CS AI · 6h ago · 7/10
Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization
Researchers propose post-training quantization techniques for large language models that combine quantile-robust scaling policies with learned channel scales, reporting an 18.5% error reduction on LLaMA-3.2-1B under W4A4 (4-bit weights, 4-bit activations) quantization. The work targets activation quantization errors caused by outlier-dominated channels and aims to make LLM deployment more efficient without full model retraining.
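
To illustrate why outlier-dominated channels hurt 4-bit activation quantization, here is a minimal NumPy sketch. It is not the paper's method: it uses fixed per-channel statistics (absolute max or a quantile) in place of learned scales, and it omits the rotation transform entirely. The function names, the synthetic data, the 20x outlier factor, and the 0.99 quantile are all assumptions made for illustration.

```python
import numpy as np

def fake_quant_int4(x):
    """Symmetric per-tensor 4-bit fake quantization on the integer grid [-8, 7]."""
    step = max(np.abs(x).max(), 1e-8) / 7.0
    return np.clip(np.round(x / step), -8, 7) * step

def channel_scales(x, q=None):
    """Per-channel scale from |x|: absolute max if q is None, else the q-quantile.

    Hypothetical helper; the paper learns its channel scales instead.
    """
    mag = np.abs(x)
    s = mag.max(axis=0) if q is None else np.quantile(mag, q, axis=0)
    return np.maximum(s, 1e-8)  # guard against division by zero

def rel_error(x, x_hat):
    """Relative Frobenius-norm reconstruction error."""
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)

# Synthetic activations (tokens x channels) with one outlier-dominated channel.
rng = np.random.default_rng(0)
x = rng.standard_normal((256, 64))
x[:, 3] *= 20.0  # assumed outlier magnitude

# Baseline: one quantization step for the whole tensor, set by the outlier channel.
err_plain = rel_error(x, fake_quant_int4(x))

# Divide each channel by its scale before quantizing, then multiply back.
# In smoothing-style schemes the scale is typically folded into the next
# layer's weights rather than applied at runtime.
s = channel_scales(x)
err_smooth = rel_error(x, fake_quant_int4(x / s) * s)

# Quantile-based variant: scales ignore the most extreme 1% of each channel.
s_q = channel_scales(x, q=0.99)
err_quantile = rel_error(x, fake_quant_int4(x / s_q) * s_q)

print(f"plain: {err_plain:.3f}  max-scaled: {err_smooth:.3f}  quantile-scaled: {err_quantile:.3f}")
```

In this toy setup the single outlier channel sets the quantization step for the whole tensor, so the ordinary channels are rounded very coarsely. Rescaling per channel first spreads the 4-bit grid more evenly. Whether a quantile or a maximum gives the better scale depends on the data, which is presumably the kind of trade-off a learned scale is meant to resolve.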