AI · Neutral · arXiv - CS AI · 1d ago · 6/10
Assessment Design in the AI Era: A Method for Identifying Items Functioning Differentially for Humans and Chatbots
Researchers developed a method that uses Differential Item Functioning (DIF) analysis to identify systematic differences between human and AI chatbot performance on educational assessments. The study tested six leading chatbots, including ChatGPT-4o, Gemini, and Claude, on chemistry and entrance exams, with the aim of helping educators design AI-resistant assessments.
Meta · ChatGPT · Claude
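
For readers unfamiliar with DIF, a minimal sketch of one common DIF statistic may help. The summary does not say which DIF procedure the study uses, so the Mantel-Haenszel approach below is an assumption chosen for illustration, not the authors' method. The function name `mantel_haenszel_dif`, the group labels, and the score-band strata are hypothetical. Test takers are matched on overall ability (the stratum), and the item is flagged when one group answers it correctly more often than equally able members of the other group.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Mantel-Haenszel DIF for a single item (illustrative sketch).

    records: iterable of (group, stratum, correct) where
      group   is "human" (reference) or "chatbot" (focal),
      stratum is a matching variable such as a total-score band,
      correct is truthy if the item was answered correctly.
    Returns (alpha_mh, mh_d_dif).
    """
    # Per-stratum 2x2 counts:
    # [human right, human wrong, chatbot right, chatbot wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, stratum, correct in records:
        if group not in ("human", "chatbot"):
            raise ValueError(f"unknown group: {group!r}")
        idx = (0 if group == "human" else 2) + (0 if correct else 1)
        tables[stratum][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n  # human right * chatbot wrong
        den += b * c / n  # human wrong * chatbot right
    if num == 0 or den == 0:
        raise ValueError("common odds ratio is undefined for these counts")

    alpha = num / den               # > 1: item favors humans at equal ability
    d_dif = -2.35 * math.log(alpha)  # ETS delta scale; negative favors humans
    return alpha, d_dif


# Synthetic counts, made up for illustration only (not data from the study).
records = (
    [("human", "low", 1)] * 10 + [("human", "low", 0)] * 10
    + [("chatbot", "low", 1)] * 5 + [("chatbot", "low", 0)] * 15
    + [("human", "high", 1)] * 15 + [("human", "high", 0)] * 5
    + [("chatbot", "high", 1)] * 10 + [("chatbot", "high", 0)] * 10
)
alpha, d_dif = mantel_haenszel_dif(records)
```

An odds ratio near 1 (a delta near 0) suggests the item behaves similarly for both groups once ability is matched. A large deviation in either direction marks an item that humans and chatbots handle differently, which is the kind of item such an analysis is meant to surface.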