AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Diffusion-based Cumulative Adversarial Purification for Vision Language Models
Researchers present DiffCAP, a diffusion-based defense that protects Vision Language Models (VLMs) from adversarial attacks. It injects noise into a perturbed input and uses a similarity threshold to purify the input before inference. The method is reported to perform better across multiple datasets and VLM architectures, with lower computational overhead than existing defense techniques.
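The noise-then-threshold idea can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed` and `toy_denoise` are hypothetical stand-ins for a VLM image encoder and a pretrained diffusion denoiser, and the stopping rule (cosine similarity between the embeddings of two consecutive noisy inputs), the `threshold`, `sigma`, and `max_steps` values are all assumptions made for illustration.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(image, chunk=4):
    """Hypothetical encoder: mean of each chunk of pixels (assumption)."""
    return [
        sum(image[i:i + chunk]) / len(image[i:i + chunk])
        for i in range(0, len(image), chunk)
    ]


def toy_denoise(image, steps):
    """Hypothetical denoiser: clamps pixels to [0, 1] and ignores `steps`.

    A real system would run a pretrained diffusion model here.
    """
    return [min(1.0, max(0.0, p)) for p in image]


def purify(image, embed, denoise, threshold=0.95, sigma=0.1,
           max_steps=100, seed=0):
    """Add Gaussian noise step by step until consecutive embeddings agree.

    Returns the denoised image and the number of noise steps applied.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    current = list(image)
    prev_emb = embed(current)
    steps = max_steps
    for step in range(1, max_steps + 1):
        # Noise accumulates: each step perturbs the already-noisy input.
        current = [p + rng.gauss(0.0, sigma) for p in current]
        emb = embed(current)
        if cosine_similarity(prev_emb, emb) >= threshold:
            steps = step
            break
        prev_emb = emb
    return denoise(current, steps), steps


if __name__ == "__main__":
    flat_image = [0.5] * 16  # toy 16-pixel "image"
    purified, used = purify(flat_image, toy_embed, toy_denoise)
    print(f"noise steps used: {used}, pixels returned: {len(purified)}")
```

In this sketch the threshold decides how much noise each input receives, so the amount of noise adapts per input instead of being fixed in advance; whether DiffCAP uses exactly this stopping rule should be checked against the paper.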