y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#lean-4 News & Analysis

2 articles tagged with #lean-4. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

2 articles
AINeutralarXiv – CS AI · Apr 107/10
🧠

The Defense Trilemma: Why Prompt Injection Defense Wrappers Fail?

Researchers prove mathematically that no continuous input-preprocessing defense can simultaneously maintain utility, preserve model functionality, and guarantee safety against prompt injection attacks in language models with connected prompt spaces. The findings establish a fundamental trilemma showing that defenses must inevitably fail at some threshold inputs, with results verified in Lean 4 and validated empirically across three LLMs.

AINeutralarXiv – CS AI · May 126/10
🧠

FormalRewardBench: A Benchmark for Formal Theorem Proving Reward Models

Researchers introduce FormalRewardBench, the first benchmark for evaluating reward models in formal theorem proving using Lean 4. The benchmark reveals that frontier LLMs like Claude Opus outperform specialized theorem provers at evaluating proof quality, suggesting that theorem proving ability does not transfer to proof evaluation tasks.

🧠 Claude🧠 Opus