y0news
AnalyticsDigestsSourcesRSSAICrypto
#seam1 article
1 articles
AIBullisharXiv โ€“ CS AI ยท 5d ago7/105
๐Ÿง 

Self-Destructive Language Model

Researchers introduce SEAM, a novel defense mechanism that makes large language models 'self-destructive' when adversaries attempt harmful fine-tuning attacks. The system allows models to function normally for legitimate tasks but causes catastrophic performance degradation when fine-tuned on harmful data, creating robust protection against malicious modifications.