AI · Bullish · arXiv - CS AI · 5d ago · 6/10
AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size
Researchers introduce AdaBlock-dLLM, a training-free optimization technique for diffusion-based large language models that adaptively adjusts the block size during inference based on semantic structure. The method addresses limitations of conventional fixed-block semi-autoregressive decoding and improves accuracy by up to 5.3% under the same throughput budget.