#academic-publishing News & Analysis

21 articles tagged with #academic-publishing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 10 · 7/10

Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community

Researchers demonstrate that AI-assisted peer review systems are vulnerable to simple adversarial attacks: superficially rephrasing an abstract raises acceptance ratings by up to 1.31 points on a 10-point scale without changing the underlying scientific content. The manipulation is cheap ($1 and 5 minutes), which reveals systemic risks in AI-mediated scientific evaluation and raises concerns about authors optimizing for algorithmic judgment rather than merit.

🧠 GPT-5 · 🧠 Gemini

AI · Neutral · arXiv – CS AI · May 29 · 7/10

PRAIB: Peer Review AI Benchmark of Behaviour of LLM-Assisted Reviewing

Researchers introduce PRAIB, a benchmark framework that evaluates how Large Language Models perform peer review compared to human reviewers. Analysis of 11,000 LLM-generated reviews across major AI conferences reveals significant behavioral divergences: LLM ratings show less variability, positive bias, overconfidence, and frequently miss atomic weaknesses that human reviewers catch.

AI · Bearish · arXiv – CS AI · May 9 · 7/10

When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge

A comprehensive study reveals that while AI adoption in research has surged exponentially since 2015, the technology remains concentrated in narrow domains tied to computer science with limited epistemological transformation. The research identifies concerning patterns including higher retraction rates in AI-supported work, citation inflation, and geographic disparities in adoption across countries and disciplines.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

PeerCheck: Enhancing LLM-Generated Academic Reviews Towards Human-Level Quality

Researchers introduce PeerCheck, a framework that analyzes differences between LLM-generated and human-written academic reviews, finding that LLMs prioritize theoretical aspects while humans emphasize methodology. Techniques such as Chain-of-Thought prompting improve LLM review quality, though retrieval-augmented generation surprisingly produces inconsistent and sometimes degraded results.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Rebuttals Move Peer-Review Scores, but Initial-Review Structure Bounds the Movement

Researchers analyzed 73,000 reviewer trajectories from ICLR 2024-2025 to measure how author rebuttals affect peer-review scores. Using LLMs as measurement tools, they found that while rebuttals can move scores, initial review structure predicts most score movement, constraining rebuttal impact to measurable but bounded effects.

🧠 Claude · 🧠 Opus · 🧠 Gemini

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Benchmarking Agentic Review Systems

Researchers benchmarked AI-powered peer review systems across multiple models and datasets, finding that the best configurations achieve 83% accuracy in ranking papers by quality and catch 71.6% of intentionally injected errors. While AI review systems show promise in tracking human quality judgments and earning positive user feedback, they still require substantial improvement before serving as primary peer review mechanisms.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Charting the Future of Scholarly Knowledge with AI: A Community Perspective

Researchers across disciplines are independently developing AI tools to manage the explosion of scholarly publications, but limited cross-community collaboration is slowing progress. The article advocates for fostering dialogue between research communities to identify shared challenges, exchange best practices, and create more integrated solutions for knowledge organization and extraction.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

EGTR-Review: Efficient Evidence-Grounded Scientific Peer Review Generation via Multi-Agent Teacher Distillation

EGTR-Review presents a novel framework for automating scientific peer review using a multi-agent teacher model that distills its reasoning into a lightweight student model, achieving superior performance with significantly lower computational costs while maintaining evidence traceability and factual grounding.

AI · Neutral · arXiv – CS AI · Jun 4 · 5/10

Automatic Generation of Titles for Research Papers Using Language Models

Researchers propose an automated technique for generating research paper titles from abstracts using large language models, testing multiple approaches including fine-tuned PEGASUS and zero-shot GPT-3.5-turbo. Fine-tuned PEGASUS-large emerges as the top performer, though ChatGPT demonstrates creative title generation capabilities, suggesting AI-generated titles are practical and reliable for academic publishing workflows.

🧠 ChatGPT

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions

Researchers developed AI-Paper-Review, a tool that generates structured peer review feedback for academic papers using multiple AI reviewers, and conducted a case study on 20 computer architecture submissions to measure how well AI review aligns with human review. The study finds that AI review can identify significant portions of human-raised issues while also surfacing problems missed by human reviewers, raising important questions about AI's role in academic peer review without endorsing its use for formal publication decisions.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Researchers introduce Crafter, a multi-agent system for generating publication-quality scientific figures from diverse inputs that generalizes across figure types without architectural changes. The work addresses a critical gap in automation tools by enabling editable SVG outputs and introduces CraftBench, a comprehensive benchmark for evaluating figure generation across multiple types and input conditions.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

DiagramRAG: A Lightweight Framework to Retrieve Scientific Diagram for Figure Generation

DiagramRAG is a new retrieval-augmented framework that converts rough sketches into publication-quality scientific diagrams by retrieving semantically and topologically compatible reference diagrams. The system achieves strong performance metrics (F1-scores of 0.848 and 0.802 on benchmark datasets) while maintaining efficient inference at 35.48 seconds per sample.

🏢 Hugging Face

AI · Neutral · arXiv – CS AI · May 27 · 6/10

TADDLE: A Tool-Augmented Agent for Detecting Deficient LLM-Generated Peer Reviews

Researchers introduce TADDLE, an AI system that detects quality deficiencies in LLM-generated peer reviews by decomposing analysis into specialized tools and multi-label classification. The work addresses a growing problem in academic publishing where AI-written reviews are fluent but potentially flawed, backed by the first expert-annotated benchmark of 1,800 reviews across six defect categories.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

AI evaluation may bias perceptions: The importance of context in interpreting academic writing

A new study demonstrates that pooled benchmarks for detecting AI-generated academic text systematically misrepresent AI adoption across countries and research fields by ignoring contextual stylistic variations. Using country-field-specific benchmarks instead provides more accurate measurements and reveals that previous estimates substantially over- or underestimated AI use depending on geographic and disciplinary context.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

CitePrism: Human-in-the-Loop AI for Citation Auditing and Editorial Integrity

CitePrism introduces a human-in-the-loop AI framework designed to assist editors and reviewers in auditing manuscript citations for relevance, accuracy, and ethical appropriateness. The system combines large language models, semantic similarity analysis, and metadata verification to flag potentially problematic citations, achieving moderate agreement with human reviewers in preliminary testing on a pavement engineering manuscript.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents

Researchers introduce PaperFit, a vision-in-the-loop AI agent that automates the typesetting optimization of LaTeX scientific documents by iteratively rendering pages, diagnosing visual defects, and applying constrained repairs. The work formalizes Visual Typesetting Optimization (VTO) as a critical missing stage in document automation, addressing the gap between compilable but visually flawed PDFs and publication-ready outputs through a new benchmark of 200 papers.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

A comprehensive survey examines how large language models can assist or automate peer review processes across academia, synthesizing techniques for review generation, post-review tasks, and evaluation methods. The research catalogs datasets and modeling approaches while addressing ethical concerns and practical implementation challenges for integrating AI into scholarly publishing workflows.

AI · Bullish · arXiv – CS AI · Apr 15 · 6/10

GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses

Researchers introduce GoodPoint, an AI system trained to generate constructive scientific feedback by learning from author responses to peer review. The method improves feedback quality by 83.7% over baseline models and outperforms larger LLMs like Gemini-3-flash, demonstrating that specialized training on valid, actionable feedback signals yields better results than general-purpose models.

🧠 Gemini

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment

Researchers introduced NovBench, the first large-scale benchmark for evaluating how well large language models can assess research novelty in academic papers. The benchmark comprises 1,684 paper-review pairs from a leading NLP conference and reveals that current LLMs struggle with scientific novelty comprehension despite promise in peer review support.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

Researchers introduce FactReview, an AI system that improves academic peer review by combining claim extraction, literature positioning, and code execution to verify research claims. The system addresses weaknesses in current LLM-based reviewing by grounding assessments in external evidence rather than relying solely on manuscript narratives.

AI · Neutral · arXiv – CS AI · Mar 11 · 4/10

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Researchers propose RbtAct, a novel approach that uses peer review rebuttals as supervision to train AI models to generate more actionable scientific review feedback. The system leverages a new dataset, RMR-75K, and a fine-tuned Llama-3.1-8B model to produce focused, implementable guidance rather than superficial comments.

🧠 Llama