AI · Neutral · arXiv – CS AI · 3h ago · 6/10
Evaluating the Realism of LLM-powered Social Agents: A Case Study of Reactions to Spanish Online News
Researchers evaluated whether large language models (LLMs) can realistically simulate human behavior in online discourse by comparing LLM-generated reactions to Spanish news articles with real audience responses on three dimensions: hate speech, sentiment, and semantic alignment. The study found that off-the-shelf models significantly under-reproduce hate speech and introduce model-specific biases, while fine-tuning improves fidelity, though the gains vary by model.