AI · Bullish · arXiv – CS AI · 6h ago · 6/10
Self-Distillation Policy Optimization via Visual Feedback: Bridging Code and Visual Artifacts
Researchers introduce Visual-SDPO, a self-distillation framework that enables code-generating LLMs to improve visual artifact quality by learning from rendered output feedback. The method achieves 10+ point improvements on code-to-visual generation benchmarks while maintaining inference efficiency.