AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers
Researchers introduce SafeDIG, a safety steering framework designed to steer text-to-image diffusion transformers such as FLUX.1 and Stable Diffusion 3.5 away from generating harmful content. The method uses sparse autoencoders and adaptive decoding to maintain safety controls across different risk domains while preserving image quality.
🧠 Stable Diffusion