AI · Bullish · arXiv – CS AI · Mar 37/104
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
Researchers introduce Group Tree Optimization (GTO), a training method that improves speculative decoding for large language models by aligning draft-model training with the policy actually used at decoding time. The authors report a 7.4% longer acceptance length and an additional 7.7% speedup over existing state-of-the-art methods across multiple benchmarks and LLMs.
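For readers unfamiliar with the metric, acceptance length is roughly the number of drafted tokens the target model accepts in one verification step. The sketch below is a simplified illustration under greedy verification and assumes a single draft chain rather than a draft tree. The function name `accepted_length` and the token IDs are hypothetical and not taken from the paper.

```python
def accepted_length(draft_tokens, target_tokens):
    """Count leading draft tokens that match the target model's greedy choices.

    Assumption: greedy verification over a single draft chain. Tree-based
    drafting, as the GTO title suggests, would verify several candidate paths.
    """
    accepted = 0
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            break  # the first mismatch ends the accepted prefix
        accepted += 1
    return accepted


# Hypothetical token IDs: the draft agrees with the target on the first two.
draft = [12, 7, 99, 4]
target = [12, 7, 31, 4]
print(accepted_length(draft, target))  # 2
```

Some papers also count the extra token that the target model contributes after each verification step, so reported acceptance lengths can be one higher than this prefix count.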