AIBullish · arXiv – CS AI · 9h ago · 7/10
🧠
When Losses Align: Gradient-Based Composite Loss Weighting for Efficient Pretraining
Researchers propose a gradient-based bilevel optimization method that automatically learns composite loss weights during pretraining by aligning each loss's gradients with downstream objectives. The approach replaces manual weight tuning at a compute overhead of roughly 30% over baseline training, while matching or exceeding manually tuned baselines on event-sequence and computer-vision tasks.
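The core idea can be illustrated with a minimal sketch: weight each constituent loss by how well its gradient aligns with the gradient of a downstream objective. Note this is a simplified illustration under assumed details (cosine alignment, clipping of negative similarity, simple normalization), not the paper's exact bilevel procedure; the function name `alignment_weights` is hypothetical.

```python
import numpy as np

def alignment_weights(loss_grads, downstream_grad, eps=1e-8):
    """Illustrative sketch: weight each loss by the (clipped) cosine
    alignment of its gradient with the downstream objective's gradient,
    then normalize the weights to sum to one."""
    d = downstream_grad / (np.linalg.norm(downstream_grad) + eps)
    sims = []
    for g in loss_grads:
        g_hat = g / (np.linalg.norm(g) + eps)
        # Clip negative alignment: losses pulling against the
        # downstream objective receive zero weight.
        sims.append(max(0.0, float(np.dot(g_hat, d))))
    total = sum(sims) + eps
    return [s / total for s in sims]

# Toy example: loss A's gradient points along the downstream gradient,
# loss B's gradient is orthogonal to it.
g_a = np.array([1.0, 0.0])
g_b = np.array([0.0, 1.0])
g_down = np.array([1.0, 0.0])
w = alignment_weights([g_a, g_b], g_down)
```

In this toy case nearly all weight goes to loss A, since loss B contributes nothing toward the downstream direction. The actual method learns the weights via bilevel optimization during training rather than computing them in closed form.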