y0news
🧠 AI | ⚪ Neutral | Importance 5/10

Rebuttals Move Peer-Review Scores, but Initial-Review Structure Bounds the Movement

arXiv – CS AI | Mathieu Louis, Tibo Vanleke, Vincent Ginis, Andres Algaba
🤖 AI Summary

Researchers analyzed 73,000 reviewer trajectories from ICLR 2024-2025 to measure how author rebuttals affect peer-review scores. Using LLMs as measurement tools, they found that while rebuttals can move scores, initial review structure predicts most score movement, constraining rebuttal impact to measurable but bounded effects.

Analysis

This study addresses a transparency gap in academic peer review by quantifying rebuttal effectiveness with computational methods. The researchers used archived pre- and post-rebuttal scores to isolate rebuttal content from confounding factors such as reviewer confidence and discussion dynamics. The methodology is notable for treating LLMs as measurement instruments rather than decision-makers, maintaining human oversight while scaling the analysis across 73,000 reviewer trajectories.

The findings reveal nuanced dynamics in peer review that challenge assumptions about rebuttal influence. When the review text reads below the assigned score, only 8.3% of reviewers raise their score after the rebuttal; when the text reads above the score, the rate rises to 31.9%. This pattern may indicate that reviewers resist contradiction but respond to substantive rebuttals. The 44-feature taxonomy of reviewer-author exchanges developed in the study provides actionable categories for distinguishing successful from failed rebuttals, with 23 features replicating across models and validation sets.
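The conditional rates above can be illustrated with a minimal sketch. The field names (`text_implied_score`, `initial_score`, `final_score`) and the toy records are assumptions made for illustration only; they are not the paper's data schema or its data, and the paper's actual grouping procedure may differ.

```python
from collections import defaultdict


def increase_rate_by_group(trajectories):
    """Share of reviewers who raised their score, grouped by text-vs-score gap.

    Each trajectory is a dict with hypothetical keys:
      text_implied_score: score the review text reads as (e.g. LLM-estimated)
      initial_score:      score the reviewer assigned before the rebuttal
      final_score:        score after the rebuttal
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [increases, total]
    for t in trajectories:
        gap = t["text_implied_score"] - t["initial_score"]
        if gap < 0:
            group = "text_below_score"
        elif gap > 0:
            group = "text_above_score"
        else:
            group = "aligned"
        counts[group][1] += 1
        if t["final_score"] > t["initial_score"]:
            counts[group][0] += 1
    return {g: inc / total for g, (inc, total) in counts.items()}


# Toy records, invented for illustration (not from the study).
toy = [
    {"text_implied_score": 4, "initial_score": 6, "final_score": 6},
    {"text_implied_score": 5, "initial_score": 6, "final_score": 6},
    {"text_implied_score": 3, "initial_score": 5, "final_score": 6},
    {"text_implied_score": 4, "initial_score": 6, "final_score": 5},
    {"text_implied_score": 7, "initial_score": 5, "final_score": 6},
    {"text_implied_score": 8, "initial_score": 6, "final_score": 6},
    {"text_implied_score": 6, "initial_score": 6, "final_score": 6},
]
rates = increase_rate_by_group(toy)
print(rates)
```

On real data, the two groups of interest would correspond to the reported 8.3% and 31.9% increase rates.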

The predictive modeling shows that initial review structure alone reaches an AUC of 0.747 for predicting score movement. Adding resolved exchange signals raises this only to 0.804, a gain of 0.057, indicating that structural factors outweigh rebuttal content. This suggests that review quality and initial reviewer positioning largely predetermine outcomes before rebuttals occur.
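To make the AUC comparison concrete, here is a minimal sketch of the rank-based definition of AUC: the probability that a randomly chosen positive case (score moved) is ranked above a randomly chosen negative one. The labels and model scores are invented toy values, and the function is a generic implementation, not the paper's modeling pipeline.

```python
def auc(labels, scores):
    """Rank-based AUC: P(positive outranks negative), ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (invented): 1 = score moved after rebuttal, 0 = unchanged.
labels = [1, 1, 1, 0, 0, 0]
structure_only = [0.8, 0.4, 0.6, 0.5, 0.3, 0.7]   # hypothetical baseline model
with_exchange = [0.9, 0.6, 0.7, 0.5, 0.2, 0.65]   # hypothetical augmented model

auc_base = auc(labels, structure_only)
auc_aug = auc(labels, with_exchange)
print(f"baseline AUC={auc_base:.3f}, augmented AUC={auc_aug:.3f}")

# Gain between the two AUC values reported in the article.
reported_gain = round(0.804 - 0.747, 3)
print(f"reported gain: {reported_gain}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why a move from 0.747 to 0.804 reads as a modest improvement over an already informative baseline.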

These findings have implications for improving peer review. They suggest focusing on initial review quality and reviewer calibration rather than assuming that strong rebuttals can overcome systemic issues. The failure modes identified in exchanges offer targets for training reviewers and for guiding authors toward effective rebuttal strategies. Understanding these constraints supports more realistic expectations about peer review reform.

Key Takeaways
  • Initial review structure predicts most score movement (AUC 0.747); adding exchange information raises AUC only to 0.804, which bounds rebuttal impact
  • Score increase rates range from 8.3% when review text reads below the assigned score to 31.9% when it reads above
  • A 44-feature taxonomy of reviewer-author exchanges was developed, with 23 features replicating robustly across models and validation years
  • Rebuttals show measurable but bounded effects, and the most robust exchange signals reflect rebuttal failure modes rather than successes
  • The study analyzed 73,000 reviewer trajectories from ICLR 2024-2025, using LLMs as measurement instruments while maintaining human oversight
Models mentioned: Claude (Anthropic), Opus (Anthropic), Gemini (Google)