Lookahead Sample Reward Guidance for Test-Time Scaling of Diffusion Models
Researchers present LiDAR, a test-time scaling method for diffusion models that uses efficient reward guidance to align generated samples more closely with human intent. The approach performs comparably to existing gradient-based guidance methods while sampling 9.5x faster, because it computes expected future rewards from marginal samples and never backpropagates through a neural network.
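
To make the idea of backpropagation-free reward guidance concrete, here is a minimal toy sketch. It is not the authors' algorithm, only one standard way such a guidance term can be estimated. It assumes a Gaussian forward kernel q(x_t | x_0) = N(alpha * x_0, sigma^2 I), a set of samples drawn beforehand from the model's marginal, and a scalar reward already computed for each sample. The names `reward_guidance`, `alpha`, `sigma`, and `temperature` are hypothetical. Under these assumptions, the gradient of the log expected exponentiated reward given x_t reduces to a difference of two weighted sample means, so only reward values are needed and no reward gradients.

```python
import math

def log_gaussian_kernel(x_t, x0, alpha, sigma):
    # log q(x_t | x0) up to an additive constant, assuming q = N(alpha * x0, sigma^2 I)
    sq_dist = sum((xt - alpha * a) ** 2 for xt, a in zip(x_t, x0))
    return -sq_dist / (2.0 * sigma ** 2)

def softmax(logits):
    # Numerically stable softmax over a list of floats
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_mean(weights, points):
    # Weighted average of equal-length vectors
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]

def reward_guidance(x_t, samples, rewards, alpha, sigma, temperature=1.0):
    """Estimate grad_{x_t} log E[exp(r(x0) / temperature) | x_t] from samples.

    Hypothetical helper: uses self-normalized importance weights over
    marginal samples, so the reward is only evaluated, never differentiated.
    """
    log_q = [log_gaussian_kernel(x_t, x0, alpha, sigma) for x0 in samples]
    # Weights approximating the posterior p(x0 | x_t)
    w_plain = softmax(log_q)
    # Weights approximating the reward-tilted posterior
    w_tilted = softmax([lq + r / temperature for lq, r in zip(log_q, rewards)])
    mean_plain = weighted_mean(w_plain, samples)
    mean_tilted = weighted_mean(w_tilted, samples)
    scale = alpha / sigma ** 2
    return [scale * (t - p) for t, p in zip(mean_tilted, mean_plain)]

# Toy usage: two 1-D samples; the one at +1.0 has the higher reward,
# so the guidance at x_t = 0 points in the positive direction.
guidance = reward_guidance(
    x_t=[0.0], samples=[[-1.0], [1.0]], rewards=[0.0, 2.0], alpha=1.0, sigma=1.0
)
```

In this sketch the guidance vector would be added to the model's score estimate at each denoising step. How the samples are chosen and how the rewards are looked ahead are exactly the parts a real method has to specify, and they are left abstract here.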