AI · Bullish · arXiv – CS AI · 7h ago · 6/10
Improving Visual Representation Alignment Generation with GRPO
Researchers propose VRPO, a reinforcement learning-based optimization method that improves training efficiency in diffusion transformers by dynamically aligning generative and discriminative representations. The approach replaces static alignment losses with adaptive reward-based optimization, achieving up to 1.8 FID improvement and 2.3x faster training compared to existing methods.
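The summary does not say how VRPO computes or applies its rewards, so the following is only a minimal sketch of the group-relative advantage step that GRPO-style methods generally use: each reward in a sampled group is normalized against the group's own mean and spread. The function name, the `eps` constant, and the reward values are hypothetical illustrations, not taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize a group of rewards against the group's mean and std.

    Assumption: population standard deviation is used here; implementations
    differ on this detail, and the source does not specify one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical alignment rewards for four samples drawn from the same prompt.
rewards = [0.2, 0.5, 0.8, 0.5]
advantages = group_relative_advantages(rewards)
```

Samples scoring above the group mean get a positive advantage and those below get a negative one, so the update signal depends on relative rather than absolute reward.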