AI · Bearish · arXiv – CS AI · 14h ago · 7/10
Review Arcade: On the Human Alignment and Gameability of LLM Reviews
Researchers evaluated LLM-generated peer reviews of scientific papers using ACL Rolling Review data. They found limited alignment between LLM and human reviews, and showed that authors can strategically game LLM feedback to improve paper scores by up to 35%. The study highlights emerging risks in automated academic review systems as both reviewers and authors increasingly rely on language models.