AI · Neutral · arXiv – CS AI · 9h ago · 7/10
Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression
Researchers introduce KVFundaBench to expose a critical gap in KV cache compression evaluation: while retrieval tasks remain robust under compression, reasoning tasks degrade severely due to disrupted Chain-of-Thought coherence. They propose ShotKV, which preserves semantic integrity by treating few-shot examples as indivisible units, achieving 9-18% accuracy improvements on long-context tasks while reducing latency by 11%.
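To make "few-shot examples as indivisible units" concrete, here is a minimal sketch of shot-level selection under a token budget. It is an illustration only, not the paper's ShotKV algorithm: the `Shot` type, the `select_shots` function, the per-token importance scores, and the mean-score ranking rule are all hypothetical stand-ins chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """Token span of one few-shot example (hypothetical helper type)."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)


def select_shots(token_scores, shots, budget):
    """Return the token indices to keep, choosing whole shots only.

    Assumption: each shot is ranked by the mean of its per-token scores.
    A shot is either kept in full or evicted in full, so no example is
    ever cut in the middle.
    """
    def mean_score(i):
        span = token_scores[shots[i].start:shots[i].end]
        return sum(span) / len(span)

    # Visit shots from highest to lowest mean score.
    ranked = sorted(range(len(shots)), key=mean_score, reverse=True)

    kept, used = [], 0
    for i in ranked:
        length = shots[i].end - shots[i].start
        if used + length <= budget:  # keep the shot only if it fits entirely
            kept.append(i)
            used += length

    kept.sort()  # restore the original prompt order
    return [t for i in kept for t in range(shots[i].start, shots[i].end)]


# Three shots covering tokens 0-2, 3-5, and 6-7, with made-up scores.
scores = [0.1, 0.1, 0.1, 0.9, 0.8, 0.7, 0.5, 0.5]
shots = [Shot(0, 3), Shot(3, 6), Shot(6, 8)]
print(select_shots(scores, shots, budget=4))  # [3, 4, 5]
```

With a budget of 4, a token-level top-k policy would keep tokens 3, 4, 5, and 6, leaving the third example half-evicted. The shot-level policy keeps only the complete second example and leaves one slot unused, trading a little cache capacity for intact examples.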