AI · Bullish · arXiv – CS AI · 18h ago · 7/10
Vision Language Model Helps Private Information De-Identification in Vision Data
Researchers introduce VisShield, a privacy-enhancing framework for Vision Language Models that uses specialized instruction-tuning and the OPTIC dataset to detect and mask sensitive information like Protected Health Information in images. The approach combines OCR-focused prompts with tailored training to enable VLMs to recognize privacy-sensitive text and output precise bounding boxes for effective de-identification.
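The final masking step can be illustrated with a minimal sketch. It is not VisShield's implementation: it assumes the model has already returned pixel-coordinate boxes, and the `(x0, y0, x1, y1)` box format, the `mask_regions` helper, and the nested-list grayscale image are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def mask_regions(
    image: Sequence[Sequence[int]],
    boxes: Sequence[Box],
    fill: int = 0,
) -> List[List[int]]:
    """Return a copy of `image` with every pixel inside each box set to `fill`."""
    height = len(image)
    width = len(image[0]) if height else 0
    masked = [list(row) for row in image]  # copy rows so the input is not modified
    for x0, y0, x1, y1 in boxes:
        # Clamp to the image bounds so a slightly-off prediction cannot raise.
        left, right = max(0, x0), min(width, x1)
        top, bottom = max(0, y0), min(height, y1)
        for y in range(top, bottom):
            for x in range(left, right):
                masked[y][x] = fill
    return masked


if __name__ == "__main__":
    # A 3x4 grayscale image with one assumed sensitive-text region.
    image = [[9, 9, 9, 9] for _ in range(3)]
    for row in mask_regions(image, [(1, 0, 3, 2)]):
        print(row)
```

In a real pipeline the boxes would come from the model's output and the image from an imaging library; a solid fill is used here instead of blurring because it removes the underlying pixels outright.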