y0news

#benchmark-compression News & Analysis

2 articles tagged with #benchmark-compression. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 8h ago · 7/10

Learning More from Less: Unlocking Internal Representations for Benchmark Compression

RepCore, a new method for compressing LLM benchmarks, uses aligned hidden states from neural networks to identify representative test subsets rather than relying solely on correctness labels. The approach achieves accurate performance estimation with as few as ten source models, addressing the statistical instability that plagues existing coreset methods when evaluation data is limited.

AI · Bullish · arXiv – CS AI · 8h ago · 6/10

MINCE: Shrinking LLM Evaluation Datasets via Few-Model Monte Carlo Calibration

Researchers introduce MINCE, a method that cuts the computational cost of evaluating large language models by shrinking benchmark datasets. Using Monte Carlo simulation with only a few calibration models, MINCE reduces dataset size by 54-89% while keeping accuracy drift within acceptable thresholds, enabling 2.7-8.1x faster GPU evaluations.