y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#safety-calibration News & Analysis

1 article tagged with #safety-calibration. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AIBullisharXiv – CS AI · Apr 147/10
🧠

Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility

Researchers propose Risk Awareness Injection (RAI), a lightweight, training-free framework that enhances vision-language models' ability to recognize unsafe content by amplifying risk signals in their feature space. The method maintains model utility while significantly reducing vulnerability to multimodal jailbreak attacks, addressing a critical security gap in VLMs.