arXiv – CS AI
Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining
Researchers present a staged-promotion protocol for screening machine learning configurations during micro-pretraining. The protocol uses fixed budget increments across heterogeneous hardware to cut experimental cost while mitigating the risk of selecting configurations that perform well only at tiny scales. The study finds that early-stage rankings are unstable across hardware types, but that a frozen promotion rule still identified a consistent top performer while reducing total GPU-hours from 432 to 169.2, a reduction of about 61%.
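The general idea of staged promotion can be illustrated with a minimal sketch. This is not the paper's protocol: the promotion rule (keep a fixed fraction of the best-scoring configurations at each stage), the function name `staged_promotion`, the `keep_fraction` parameter, and the toy loss model are all assumptions made for illustration. The rule is "frozen" here only in the sense that it is fixed before any scores are seen.

```python
from typing import Callable, Dict, List


def staged_promotion(
    configs: List[str],
    evaluate: Callable[[str, int], float],
    stage_increments: List[int],
    keep_fraction: float = 0.75,
) -> Dict[str, object]:
    """Screen configs in stages; lower score is better (e.g. validation loss).

    Hypothetical rule: after each fixed budget increment, promote the best
    `keep_fraction` of the surviving configs (at least one) to the next stage.
    """
    survivors = list(configs)
    cumulative = 0  # budget each surviving config has received so far
    spent = 0       # total budget units consumed across all configs
    for increment in stage_increments:
        cumulative += increment
        spent += increment * len(survivors)
        scores = {c: evaluate(c, cumulative) for c in survivors}
        keep = max(1, int(len(survivors) * keep_fraction))
        survivors = sorted(survivors, key=lambda c: scores[c])[:keep]
    return {"survivors": survivors, "spent": spent}


# Toy loss model (invented numbers): loss = final_loss + early_bias / budget.
# Config "c" looks best at tiny budgets but is overtaken by "b" later on.
TOY = {"a": (1.0, 0.0), "b": (0.8, 0.3), "c": (1.2, -0.6), "d": (1.5, 0.1)}


def toy_evaluate(config: str, budget: int) -> float:
    final_loss, early_bias = TOY[config]
    return final_loss + early_bias / budget


increments = [1, 1, 2]
result = staged_promotion(list(TOY), toy_evaluate, increments)
full_cost = len(TOY) * sum(increments)  # cost of training every config fully

print(result["survivors"], result["spent"], full_cost)
```

In this toy setup the stage-one ranking puts "c" first, yet the rule retains enough candidates that "b", the best configuration at the full budget, survives to the end, and the total spend is lower than training every configuration to completion. A more aggressive `keep_fraction` of 0.5 would eliminate "b" at the first stage, which is the tiny-scale selection risk the summary describes.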