y0news

#inference-time-defense News & Analysis

1 article tagged with #inference-time-defense. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 7h ago · 6/10

ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments

Researchers introduce ReasoningGuard, an inference-time safety mechanism designed to protect Large Reasoning Models from generating harmful content during their reasoning processes. The method uses internal attention mechanisms to inject safety-oriented reflections at critical points, mitigating jailbreak attacks without requiring costly fine-tuning and outperforming nine existing safeguards.