AI · Neutral · arXiv – CS AI · 7h ago · 4/10
Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models
Researchers propose a new online reinforcement learning method for improving text-to-image diffusion models that reduces variance by comparing paired trajectories and treating the entire sampling process as a single action. The approach demonstrates faster convergence and better image quality and prompt alignment compared to existing methods.