arXiv – CS AI
Training-Free Semantic Correction for Autoregressive Visual Models
Researchers present Gazer, a training-free framework that uses multimodal large language models to identify and correct semantic errors made by autoregressive visual models during image and video generation. It works through a diagnostic stage and a correction stage, which analyze intermediate generation states and adjust the generation trajectory without any additional model training.