y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#rate-distortion News & Analysis

1 article tagged with #rate-distortion. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AIBullisharXiv – CS AI · 10h ago7/10
🧠

RDKV: Rate-Distortion Bit Allocation for Joint Eviction and Quantization of the KV Cache

Researchers propose RDKV, a novel compression technique that jointly optimizes eviction and quantization of the Key-Value cache in large language models to reduce memory bottlenecks during inference. The method achieves 4.5x decode speedup and 1.9x peak memory reduction on 128K context lengths while maintaining 97.81% accuracy, addressing a critical performance constraint in LLM deployment.