y0news

#finetuning-attacks News & Analysis

1 article tagged with #finetuning-attacks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 18h ago · 7/10

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Researchers propose Patcher, a defense against malicious finetuning attacks on open-weight large language models that scales up train-time adversarial attacks to improve robustness. The technique targets full-parameter finetuning attacks, which existing alignment defenses fail to prevent, and uses an efficient parallel implementation that reduces training time while maintaining performance.