arXiv · CS AI
SorryDB: Can AI Provers Complete Real-World Lean Theorems?
Researchers have introduced SorryDB, a dynamic benchmark for evaluating AI systems' ability to prove mathematical theorems using the Lean proof assistant. The benchmark draws from 78 real-world formalization projects and addresses limitations of static benchmarks by providing continuously updated tasks that better reflect community needs.