HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs
Researchers introduce Hermes, an AI agent that interleaves informal, exploratory reasoning with formally verified proof steps in Lean. The authors report accuracy improvements of up to 40% on difficult math benchmarks alongside an 80% reduction in computational cost. By checking exploratory problem-solving against rigorous formal verification, the system targets a fundamental limitation of LLM reasoning: informal steps are not, on their own, verified.
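
To make "formally verified in Lean" concrete, here is a minimal sketch of a single informal step ("the sum of two even numbers is even") restated as a Lean 4 theorem that the proof checker accepts only if the argument is actually valid. This example is illustrative and is not taken from the Hermes paper; the theorem name `even_add_even` and the encoding of evenness as `∃ k, a = 2 * k` are assumptions chosen for the sketch.

```lean
-- Hypothetical example, not from the Hermes paper.
-- Informal claim: if a and b are even, then a + b is even.
-- "Even" is encoded here as: there exists k with a = 2 * k.
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n :=
  match ha, hb with
  -- Unpack the two witnesses k and m, then offer k + m as the new witness.
  | ⟨k, hk⟩, ⟨m, hm⟩ =>
    -- Rewrite a and b, then distribute 2 * (k + m) into 2 * k + 2 * m.
    ⟨k + m, by rw [hk, hm, Nat.mul_add]⟩
```

If the informal step were wrong (for instance, claiming the sum is odd), no such proof term would type-check, which is the kind of signal a system combining informal reasoning with Lean verification can use to reject a flawed step.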