AI · Bullish · arXiv – CS AI · 14h ago · 7/10
HARP: Hadamard-Preconditioned Adaptive Rotation Processor for Extreme LLM Quantization
Researchers introduce HARP, a learnable adaptive rotation processor that improves extreme low-bit quantization of large language models by replacing fixed Hadamard transforms with optimizable structured orthogonal processors. The technique preserves full-precision equivalence while improving perplexity and accuracy across 2- to 4-bit quantization settings on models of up to 70B parameters, with deployment speeds competitive with standard approaches.
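To make "full-precision equivalence" concrete, here is a minimal sketch of the fixed Hadamard rotation that HARP is described as replacing. It is not HARP's learnable processor, whose structure this summary does not specify. The helper names (`hadamard`, `matmul`, `max_abs_diff`), the 8-channel toy size, and the injected outlier are all assumptions chosen for illustration. The idea shown is that for any orthogonal matrix Q, (xQ)(QᵀW) equals xW before quantization, while the rotation spreads an outlier channel across all channels and so shrinks the range a quantizer has to cover.

```python
import math
import random


def hadamard(n):
    """Normalized n x n Hadamard matrix via the Sylvester construction (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Block form [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)  # scaling makes the matrix orthogonal
    return [[v * scale for v in row] for row in h]


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


random.seed(0)
n = 8  # toy hidden size (assumption)
x = [[random.gauss(0, 1) for _ in range(n)]]  # one activation row
x[0][3] = 50.0  # synthetic outlier channel (assumption)
w = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
q = hadamard(n)

# Rotating activations by Q and weights by Q^T leaves the product unchanged.
y_ref = matmul(x, w)
y_rot = matmul(matmul(x, q), matmul(transpose(q), w))
equivalence_error = max_abs_diff(y_ref, y_rot)

# The rotation spreads the outlier, reducing the peak magnitude to be quantized.
peak_before = max(abs(v) for v in x[0])
peak_after = max(abs(v) for v in matmul(x, q)[0])

print(f"equivalence error: {equivalence_error:.2e}")
print(f"peak |x| before rotation: {peak_before:.2f}, after: {peak_after:.2f}")
```

A learnable rotation, as the summary describes, would presumably keep the same orthogonality constraint so that this equivalence still holds, while tuning the rotation to the model rather than fixing it in advance.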