AI · Neutral · arXiv – CS AI · 3h ago · 6/10
EVADE-Bench: Multimodal Benchmark for Evaluating and Enhancing Evasive Content Detection
Researchers introduce EVADE-Bench, a multimodal benchmark for evaluating how well AI models detect deliberately obfuscated content in e-commerce, such as product listings that use word splitting or euphemistic language to evade moderation policies. Tests of 26 leading LLMs and VLMs reveal significant vulnerabilities even in state-of-the-art models. The findings suggest that clearer rule design and multi-agent reasoning architectures can substantially improve detection accuracy.