AIBearish — arXiv CS AI · 6h ago
CaptionFool: Universal Image Captioning Model Attacks
Researchers have developed CaptionFool, a universal adversarial attack that manipulates AI image captioning models by modifying just 1.2% of an image's patches. The attack achieves 94-96% success rates at forcing models to generate arbitrary, attacker-chosen captions, including offensive content that can bypass content moderation systems.
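To give a sense of how small a 1.2% patch budget is, here is a toy calculation assuming a ViT-style captioner with typical dimensions (the 224x224 image size and 16x16 patch size are assumptions for illustration, not figures from the paper):

```python
# Toy patch-budget arithmetic for a ViT-style image captioner.
# Assumed (not from the paper): 224x224 input, 16x16 patches.
image_side = 224
patch_side = 16
num_patches = (image_side // patch_side) ** 2  # 14 x 14 = 196 patches
budget = 0.012  # the reported 1.2% modification budget
patches_modified = max(1, round(num_patches * budget))
print(num_patches, patches_modified)  # 196 patches, only 2 modified
```

Under these assumptions, the attack would need to perturb only about 2 of 196 patches, which helps explain why such attacks are hard to spot visually.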