y0news
#direct-preference-optimization · 1 article
AI · Bullish · arXiv – CS AI · 6h ago

Surgical Post-Training: Cutting Errors, Keeping Knowledge

Researchers introduce Surgical Post-Training (SPoT), a new method to improve Large Language Model reasoning while preventing catastrophic forgetting. SPoT achieved a 6.2% accuracy improvement on Qwen3-8B using only 4k data pairs and 28 minutes of training, offering a more efficient alternative to traditional post-training approaches.