AI · Bullish · arXiv – CS AI · 5h ago · 6/10
Predict-then-Diffuse: Adaptive Response Length for Compute-Budgeted Inference in Diffusion LLMs
Researchers propose Predict-then-Diffuse, a framework for diffusion-based large language models that predicts the required response length before generation begins. This reduces the compute wasted on padding tokens and on re-computation, while maintaining output quality across multiple datasets.