AI · Neutral · arXiv – CS AI · Mar 11 · 7/10
OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences
Researchers introduce OOD-MMSafe, a new benchmark showing that current Multimodal Large Language Models (MLLMs) fail to identify hidden safety risks up to 67.5% of the time. They also present the CASPO framework, which reduces risk-identification failure rates to under 8% in consequence-driven safety scenarios.