y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#poisoning-attacks News & Analysis

3 articles tagged with #poisoning-attacks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

3 articles
AIBearisharXiv โ€“ CS AI ยท Apr 147/10
๐Ÿง 

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

Researchers have discovered a critical vulnerability in Reinforcement Learning with Verifiable Rewards (RLVR), an emerging training paradigm that enhances LLM reasoning abilities. By injecting less than 2% poisoned data into training sets, attackers can implant backdoors that degrade safety performance by 73% when triggered, without modifying the reward verifier itself.

AIBearisharXiv โ€“ CS AI ยท Mar 47/102
๐Ÿง 

Silent Sabotage During Fine-Tuning: Few-Shot Rationale Poisoning of Compact Medical LLMs

Researchers discovered a new stealth poisoning attack method targeting medical AI language models during fine-tuning that degrades performance on specific medical topics without detection. The attack injects poisoned rationales into training data, proving more effective than traditional backdoor attacks or catastrophic forgetting methods.

AIBearisharXiv โ€“ CS AI ยท Mar 37/104
๐Ÿง 

Stealthy Poisoning Attacks Bypass Defenses in Regression Settings

Researchers have developed new stealthy poisoning attacks that can bypass current defenses in regression models used across industrial and scientific applications. The study introduces BayesClean, a novel defense mechanism that better protects against these sophisticated attacks when poisoning attempts are significant.