AI · Neutral · arXiv – CS AI · 3h ago · 5/10
ChildEval: When large language models meet children's personalities
Researchers introduce ChildEval, a benchmark dataset of 29K synthesized persona profiles for evaluating how well large language models understand and respond to the preferences of children aged 3-6. The work addresses a gap in LLM evaluation by testing whether AI systems can infer and follow child-specific preferences over extended conversations; the reported results show that fine-tuning on the benchmark improves child-centered performance.