AI · Bullish · arXiv – CS AI · 7h ago · 7/10
Pull Requests as a Training Signal for Repo-Level Code Editing
Researchers introduce Clean-PR, a training methodology that leverages 2 million real-world GitHub pull requests to improve AI models' ability to perform repository-level code editing. The approach achieves significant performance gains on SWE-bench benchmarks without relying on complex agent scaffolding, demonstrating that code editing capabilities can be effectively internalized into model weights through high-quality training signals.