AI · Bullish · arXiv – CS AI · 4h ago · 7/10
Towards grounded autonomous research: an end-to-end LLM mini research loop on published computational physics
Researchers demonstrate an autonomous LLM agent capable of executing a complete research loop: reading, reproducing, critiquing, and extending computational physics papers. Testing across 111 papers reveals the agent identifies substantive flaws in 42% of cases, with 97.7% of issues requiring actual computation to detect, and produces a publishable peer-review comment on a Nature Communications paper without human direction.