AI · Bullish · arXiv – CS AI · 15h ago · 7/10
ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence
ScientistOne introduces Chain-of-Evidence, a verifiability framework that addresses critical failures in autonomous research systems, where AI agents produce plausible-looking but unreliable outputs such as fabricated citations, unverified scores, and misaligned methods. The system achieves zero hallucinated references and perfect score verification across five research tasks, significantly outperforming existing baseline systems, which exhibit systematic failure rates of up to 80%.