y0news

#loss-reweighting News & Analysis

1 article tagged with #loss-reweighting. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 14h ago · 7/10

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Researchers propose PEAR, a supervised fine-tuning (SFT) method that trains language models with the downstream reinforcement learning (RL) stage in mind rather than optimizing SFT in isolation. The approach uses importance sampling to reweight the SFT training data, addressing the distribution mismatch between the offline SFT stage and the online RL stage. The authors report performance gains of up to 14.6% on mathematical reasoning benchmarks.
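
To make the idea of importance-sampling reweighting concrete, here is a minimal Python sketch of a generic importance-weighted SFT loss. It is not PEAR's actual formulation, which the summary does not spell out. The function names, the clipping threshold `clip`, and the choice of self-normalized weights are all assumptions made for illustration. Each training sequence is weighted by the ratio of its probability under a target distribution (for example, the policy the RL stage will sample from) to its probability under the distribution that produced the offline data.

```python
import math

def importance_weights(logp_target, logp_behavior, clip=10.0):
    """Clipped, self-normalized importance weights, one per sequence.

    logp_target[i]   : log-probability of sequence i under the target
                       distribution (hypothetical: the RL-time policy).
    logp_behavior[i] : log-probability of sequence i under the distribution
                       that generated the offline SFT data.
    clip             : assumed cap on the raw ratio, to limit variance.
    """
    raw = [min(math.exp(t - b), clip) for t, b in zip(logp_target, logp_behavior)]
    total = sum(raw)
    return [r / total for r in raw]

def reweighted_nll(token_logprobs, weights):
    """Weighted negative log-likelihood.

    token_logprobs[i] holds the model's per-token log-probabilities
    for sequence i; weights[i] is that sequence's importance weight.
    """
    return -sum(w * sum(lps) for w, lps in zip(weights, token_logprobs))

# Sequences that are more likely under the target distribution get more weight.
weights = importance_weights([-1.0, -2.0], [-1.0, -1.0])
loss = reweighted_nll([[-0.5, -0.5], [-1.0, -1.0]], weights)
```

When the target and data distributions agree, every ratio is 1 and the loss reduces to an ordinary average negative log-likelihood, so the reweighting only changes training where the two distributions differ.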