y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#risk-quantification News & Analysis

1 article tagged with #risk-quantification. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 articles
AINeutralarXiv – CS AI · 6h ago6/10
🧠

From Parameter Dynamics to Risk Scoring : Quantifying Sample-Level Safety Degradation in LLM Fine-tuning

Researchers have identified a critical vulnerability in LLM safety alignment where fine-tuning on benign samples causes parameters to drift toward unsafe behaviors, erasing safety gains from millions of preference examples. The study proposes SQSD, a method to quantify and score individual training samples by their contribution to safety degradation, with demonstrated transferability across different model architectures and scales.