
#memorization-bias News & Analysis

1 article tagged with #memorization-bias. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · May 11 · 7/10


GSM-SEM: Benchmark and Framework for Generating Semantically Variant Augmentations

Researchers introduce GSM-SEM, a framework for generating semantically diverse variants of math benchmarks like GSM8K to combat memorization in LLM evaluations. Testing 14 state-of-the-art models reveals consistent performance drops averaging 28%, suggesting current leaderboard rankings may overstate true reasoning capabilities.