AI · Neutral · arXiv (CS AI) · 5h ago
M-QUEST -- Meme Question-Understanding Evaluation on Semantics and Toxicity
Researchers developed M-QUEST, a new benchmark for evaluating AI models' ability to understand and detect toxicity in internet memes. The framework identifies 10 key dimensions for meme interpretation and tests 8 open-source language models, finding that instruction-tuned models perform better but still struggle with pragmatic inference.