AI · Bullish · arXiv – CS AI · 15h ago · 7/10
E3: Issue-Level Backtesting for Automated Research Critique
Researchers introduce E3, an automated review assistant that identifies technical concerns in research papers with 90.2% recall, outperforming both human reviewers and leading AI models. The system flags unsupported claims, missing ablations, weak baselines, and validity threats. It was evaluated on 100 ICLR 2026 papers using a contamination-resistant backtesting protocol.
🏢 OpenAI · 🏢 Anthropic · 🧠 GPT-5