AI · Bullish · arXiv – CS AI · 5d ago · 6/10
Latent Diffusion Model without Variational Autoencoder
Researchers introduce SVG, a latent diffusion model that replaces the variational autoencoder with self-supervised representations. The approach uses frozen DINO features to build a semantically structured latent space, which the authors report enables faster training, fewer sampling steps, and better generative quality while preserving the semantic capabilities of the underlying features.
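
To make the idea of diffusing in a frozen feature space concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: `make_frozen_encoder` is a hypothetical stand-in for a frozen DINO encoder (a fixed random linear map), and `add_noise` uses a generic DDPM-style noising rule, which is an assumption, since the summary does not say which diffusion formulation SVG uses.

```python
import math
import random


def make_frozen_encoder(in_dim, latent_dim, seed=0):
    """Hypothetical stand-in for a frozen DINO encoder: a fixed linear map.

    The weights are stored in tuples and never updated, mirroring the idea
    that the feature extractor stays frozen while the diffusion model trains.
    """
    rng = random.Random(seed)
    weights = tuple(
        tuple(rng.gauss(0.0, 1.0 / math.sqrt(in_dim)) for _ in range(in_dim))
        for _ in range(latent_dim)
    )

    def encode(x):
        # Project the input vector into the latent (feature) space.
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return encode, weights


def add_noise(z0, alpha_bar, rng):
    """Assumed DDPM-style forward step: z_t = sqrt(a) * z0 + sqrt(1 - a) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [
        math.sqrt(alpha_bar) * z + math.sqrt(1.0 - alpha_bar) * e
        for z, e in zip(z0, eps)
    ]
    return zt, eps


rng = random.Random(42)
encode, frozen_weights = make_frozen_encoder(in_dim=8, latent_dim=4)

image = [rng.random() for _ in range(8)]  # toy "image" as a flat vector
z0 = encode(image)                        # clean latent from the frozen encoder
zt, eps = add_noise(z0, alpha_bar=0.5, rng=rng)  # noised latent and its noise
```

In a real system, a denoising network would be trained to recover `eps` (or `z0`) from `zt`, with gradients flowing only into that network and never into the encoder.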