y0news

#benchmark-leakage News & Analysis

1 article tagged with #benchmark-leakage. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 article
AI · Bearish · arXiv – CS AI · 7h ago · 7/10

NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models

Researchers introduce NumLeak, a framework showing that frontier large language models memorize public numeric benchmarks from their pretraining data instead of learning the underlying concepts. When prompted with a date, the models recall financial and economic metrics almost perfectly, but that accuracy collapses on recent holdout data, which points to memorization rather than reasoning.
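
The comparison described in the summary, recall before a cutoff versus on recent holdout dates, can be sketched roughly as follows. This is an illustrative sketch only, not the paper's method: the function name `recall_gap`, the 1% relative tolerance, the cutoff date, and all sample values are hypothetical placeholders.

```python
from datetime import date

def recall_gap(records, cutoff, tolerance=0.01):
    """Compare a model's numeric recall before and after a cutoff date.

    records: list of (date, true_value, model_answer) tuples (hypothetical format).
    An answer counts as a hit if it is within `tolerance` (relative) of the truth.
    Returns (accuracy_on_or_before_cutoff, accuracy_after_cutoff).
    """
    def accuracy(rows):
        if not rows:
            return None  # no data in this split
        hits = sum(
            1 for _, truth, answer in rows
            if abs(answer - truth) <= tolerance * abs(truth)
        )
        return hits / len(rows)

    pre = [r for r in records if r[0] <= cutoff]
    post = [r for r in records if r[0] > cutoff]
    return accuracy(pre), accuracy(post)

# Made-up placeholder values, not results from the study.
records = [
    (date(2020, 1, 1), 100.0, 100.0),
    (date(2020, 2, 1), 200.0, 200.5),
    (date(2024, 1, 1), 100.0, 150.0),
    (date(2024, 2, 1), 50.0, 50.0),
]
pre_acc, post_acc = recall_gap(records, cutoff=date(2023, 1, 1))
```

A large gap between the two accuracies is the kind of signal the summary attributes to memorization; a model that reasons from context would not be expected to depend so sharply on whether the date predates its training data.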