AINeutralarXiv โ CS AI ยท 7h ago6/10
๐ง
Evaluating LLMs as Human Surrogates in Controlled Experiments
Researchers compared large language models with human responses in a behavioral study on accuracy perception, finding that LLMs reproduce directional effects but with inconsistent effect magnitudes across different models. The study reveals that off-the-shelf LLMs can simulate some human belief-updating patterns in controlled experiments but lack reliable human-scale accuracy, establishing clearer boundaries for when synthetic LLM data is appropriate for behavioral research.