AIBullish · arXiv CS AI · 5h ago · 7/10
Make Your LVLM KV Cache More Lightweight
Researchers propose LightKV, a technique that reduces Key-Value (KV) cache memory overhead in Large Vision-Language Models by compressing vision tokens via cross-modality message passing guided by the text prompt. The method halves KV cache size while retaining only 55% of the original vision tokens and cutting computation by up to 40%, with performance maintained across eight benchmark datasets.
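The summary does not spell out LightKV's message-passing mechanism, but the general idea of text-guided vision-token compression can be sketched as follows: score each vision token's KV entry by how much cross-attention it receives from the text prompt, then keep only the top fraction (55% here, matching the reported ratio). All function and variable names below are illustrative assumptions, not the paper's actual API.

```python
import torch

def prune_vision_kv(vision_keys, vision_values, text_queries, keep_ratio=0.55):
    """Illustrative sketch: keep the top keep_ratio fraction of vision-token
    KV entries, ranked by mean cross-attention weight from text tokens.
    Not the paper's actual method; names and scoring are assumptions."""
    n = vision_keys.shape[0]
    k = max(1, int(n * keep_ratio))
    # Scaled dot-product attention from each text query to each vision key.
    attn = torch.softmax(
        text_queries @ vision_keys.T / vision_keys.shape[-1] ** 0.5, dim=-1
    )
    # Relevance of a vision token = average attention it receives from the prompt.
    relevance = attn.mean(dim=0)                             # shape [n]
    idx = torch.topk(relevance, k).indices.sort().values     # keep original order
    return vision_keys[idx], vision_values[idx], idx

torch.manual_seed(0)
K = torch.randn(100, 64)   # 100 vision-token keys, head dim 64
V = torch.randn(100, 64)   # matching values
Q = torch.randn(8, 64)     # 8 text-prompt query vectors
Kp, Vp, idx = prune_vision_kv(K, V, Q)
print(Kp.shape)  # torch.Size([55, 64])
```

Pruning keys and values together keeps the cache consistent, and sorting the kept indices preserves the tokens' positional order for subsequent attention steps.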