AI · Neutral · arXiv – CS AI · 3d ago · 7/10
OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences
Researchers introduce OOD-MMSafe, a new benchmark revealing that current Multimodal Large Language Models (MLLMs) fail to identify hidden safety risks up to 67.5% of the time. They also developed the CASPO framework, which reduces failure rates for risk identification in consequence-driven safety scenarios to under 8%.