AIBullisharXiv โ CS AI ยท 8h ago6/10
๐ง
Self-Corrected Image Generation with Explainable Latent Rewards
Researchers introduce xLARD, a self-correcting framework for text-to-image generation that uses multimodal large language models to provide explainable feedback and improve alignment with complex prompts. The system employs a lightweight corrector that refines latent representations based on structured feedback, addressing challenges in generating images that match fine-grained semantics and spatial relations.