AIBullisharXiv – CS AI · 15h ago7/10
🧠
HiSpec: Hierarchical Speculative Decoding for LLMs
Researchers introduce HiSpec, a hierarchical speculative decoding framework that accelerates large language model inference by using early-exit models for intermediate verification, achieving up to 2.01× throughput improvements without sacrificing accuracy.