AI · Neutral · arXiv – CS AI · 4h ago · 6/10
RWGBench: Evaluating Scholarly Positioning in Related Work Generation
Researchers introduce RWGBench, an evaluation framework for assessing how well AI language models generate related work sections for academic papers. Unlike existing metrics, which measure text similarity, RWGBench evaluates citation selection and scholarly positioning: whether a model chooses appropriate references and frames them correctly. This exposes limitations that current similarity-based evaluation obscures.