AI · Neutral · arXiv CS AI · 14h ago · 6/10
Diffusion-CAM: Faithful Visual Explanations for dMLLMs
Researchers introduce Diffusion-CAM, an interpretability method designed specifically for diffusion-based Multimodal Large Language Models (dMLLMs). Unlike existing visualization techniques, which are built for sequential models, it accounts for the parallel denoising process inherent to diffusion architectures. The authors report better localization accuracy and visual fidelity in the resulting explanations.