AIBullisharXiv – CS AI · 3h ago6/10
🧠
EvoSpec: Evolving Speculative Decoding via Real-Time Vocabulary and Parameter AdaptationTarget
EvoSpec introduces a dynamic framework for accelerating Large Language Model inference through real-time adaptation of vocabulary and parameters in speculative decoding. By addressing the vocabulary bottleneck that causes performance degradation in specialized domains, EvoSpec achieves 1.13x speedup improvements over static baselines while reducing memory overhead by 27%.