AI · Bullish · arXiv – CS AI · 9h ago · 7/10
LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models
Researchers introduce LLMCodec, a compression method that adapts video codecs such as VVC/H.266 to compress the weights of large language models. Compared with existing quantization methods, it reportedly lowers perplexity by a factor of 1.5 on LLaMA-3-8B at 2-bit precision and improves downstream task accuracy by 21%.