BoostAPR: Boosting Automated Program Repair via Execution-Grounded Reinforcement Learning with Dual Reward Models

arXiv – CS AI | Yuanhao Li, Hongbo Wang, Xiaotang Shang, Xunzhu Tang, Yiming Cao, Xuhong Chen | May 12, 2026 at 04:00 AM

🤖AI Summary

BoostAPR is a framework for automated program repair that combines reinforcement learning with two reward models to identify which code edits actually fix a bug. It reports gains on several benchmarks, including 40.7% on SWE-bench Verified, suggesting that finer-grained feedback can substantially improve a model's ability to repair software bugs.

Analysis

BoostAPR addresses a fundamental challenge in using reinforcement learning for code repair: the difficulty of identifying which specific edits contribute to fixing bugs when only end-to-end execution feedback is available. Traditional approaches suffer from sparse reward signals and coarse-grained assessments that leave the model uncertain about causality between changes and outcomes. The framework's innovation lies in its dual-model architecture, where line-level credit assignment operates at an intermediate granularity more natural to how developers think about code changes, while sequence-level assessment provides overall validation.
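
To make the dual-signal idea concrete, here is a minimal sketch. It is not the paper's implementation. It assumes a hypothetical sequence-level score for the whole patch and hypothetical per-line credit scores, and blends them into one reward per edited line. The function name `blend_rewards`, the mixing weight `alpha`, and all numbers are illustrative assumptions.

```python
from typing import List


def blend_rewards(sequence_score: float,
                  line_credits: List[float],
                  alpha: float = 0.5) -> List[float]:
    """Blend one patch-level score with per-line credits (hypothetical scheme).

    Each edited line receives
        alpha * (its own line-level credit) + (1 - alpha) * (shared sequence score).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [alpha * c + (1.0 - alpha) * sequence_score for c in line_credits]


# A three-line patch whose tests pass (sequence_score = 1.0).
# The made-up line credits say only the second line mattered for the fix.
rewards = blend_rewards(sequence_score=1.0,
                        line_credits=[0.0, 1.0, 0.0],
                        alpha=0.5)
print(rewards)  # [0.5, 1.0, 0.5]
```

With a sequence-level score alone, all three lines would receive the same reward. Adding the line-level term is what lets the decisive edit stand out.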

This research builds on the broader trend of applying machine learning to software engineering tasks, following earlier work on neural program repair and the creation of training environments such as SWE-Gym. The progression from supervised learning to reinforcement learning with increasingly fine-grained reward structures reflects the field's growing ability to handle real-world code repair.

For the AI development community, BoostAPR's results matter because they show that careful design choices in reward modeling can yield large improvements: the reported gain over the baseline model is 22.9 percentage points. The cross-language transfer results (Python-to-Java) suggest the approach captures generalizable repair strategies rather than memorized patterns.

Looking ahead, the technique may carry over to other code generation tasks, which makes it relevant for developers building automated software maintenance systems. Subsequent work may explore whether similar dual-reward architectures benefit other structured generation problems beyond program repair, potentially influencing how reinforcement learning is applied to AI coding assistants.

Key Takeaways

→BoostAPR uses dual reward models to provide line-level credit assignment for code repairs, achieving 40.7% on SWE-bench Verified
→The framework combines supervised fine-tuning on execution-verified demonstrations with PPO optimization using granular feedback signals
→Strong cross-language transfer results (24.8% on Defects4J Python-to-Java) indicate learned repair strategies generalize beyond training data
→Line-level credit allocation at intermediate granularity proves more effective than sequence-level rewards alone for identifying critical edits
→Results are competitive with open-source models while maintaining interpretability about which code regions drive successful repairs

#program-repair #reinforcement-learning #reward-modeling #code-generation #machine-learning #software-engineering #swe-bench #ai-research
