AIBullisharXiv โ CS AI ยท 5h ago7/10
๐ง
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices
Researchers introduce Vec-LUT, a novel vector-based lookup table technique that dramatically improves ultra-low-bit LLM inference on edge devices by addressing memory bandwidth underutilization. The method achieves up to 4.2x performance improvements over existing approaches, enabling faster LLM execution on CPUs than specialized NPUs.