AINeutralarXiv โ CS AI ยท 14h ago6/10
๐ง
Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice
Researchers demonstrate that small-scale proxy models commonly used by AI companies to evaluate data curation strategies produce unreliable conclusions because optimal training configurations are data-dependent. They propose using reduced learning rates in proxy model training as a simple, cost-effective solution that better predicts full-scale model performance across diverse data recipes.
๐ข Meta