AI · Bullish · arXiv – CS AI · 5d ago
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
Researchers introduce Group Tree Optimization (GTO), a training method that improves speculative decoding for large language models by aligning how the draft model is trained with the policy actually used at decoding time. Across multiple benchmarks and LLMs, GTO achieves a 7.4% longer acceptance length and an additional 7.7% speedup over existing state-of-the-art methods.