AI · Bullish · arXiv - CS AI · 6h ago
CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning
Researchers introduce CHIMERA, a compact 9K-sample synthetic dataset that enables smaller AI models to achieve reasoning performance comparable to much larger models. The dataset addresses key challenges in training reasoning-capable LLMs through automated generation and cross-validation across 8 scientific disciplines.