AI · Neutral · arXiv – CS AI · Mar 26 · 6/10
Assessment Design in the AI Era: A Method for Identifying Items Functioning Differentially for Humans and Chatbots
Researchers developed a method that uses Differential Item Functioning (DIF) analysis to identify assessment items on which humans and AI chatbots perform systematically differently. The study tested six leading chatbots, including ChatGPT-4o, Gemini, and Claude, on chemistry and entrance exams, with the aim of helping educators design AI-resistant assessments.
🏢 Meta · 🧠 ChatGPT · 🧠 Claude