y0news
#image-captioning2 articles
2 articles
AIBearisharXiv โ€“ CS AI ยท 6h ago2
๐Ÿง 

CaptionFool: Universal Image Captioning Model Attacks

Researchers have developed CaptionFool, a universal adversarial attack that can manipulate AI image captioning models by modifying just 1.2% of image patches. The attack achieves 94-96% success rates in forcing models to generate arbitrary captions, including offensive content that can bypass content moderation systems.

AIBullisharXiv โ€“ CS AI ยท 6h ago0
๐Ÿง 

Cross-modal Identity Mapping: Minimizing Information Loss in Modality Conversion via Reinforcement Learning

Researchers developed Cross-modal Identity Mapping (CIM), a reinforcement learning framework that improves image captioning in Large Vision-Language Models by minimizing information loss during visual-to-text conversion. The method achieved 20% improvement in relation reasoning on the COCO-LN500 benchmark using Qwen2.5-VL-7B without requiring additional annotations.