#safety-steering News & Analysis

2 articles tagged with #safety-steering. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 4

Conditioned Activation Transport for T2I Safety Steering

Researchers introduce Conditioned Activation Transport (CAT), a framework that prevents text-to-image AI models from generating unsafe content while preserving image quality for legitimate prompts. The method combines a geometry-based conditioning mechanism with nonlinear transport maps and, in validation on the Z-Image and Infinity architectures, significantly reduces attack success rates.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers

Researchers introduce SafeDIG, a safety steering framework designed to keep text-to-image diffusion transformers such as FLUX.1 and Stable Diffusion 3.5 from generating harmful content. The method uses sparse autoencoders and adaptive decoding to maintain safety controls across different risk domains while preserving image quality.

🧠 Stable Diffusion