AI · Bullish · arXiv – CS AI · 10h ago · 7/10
Curriculum Reinforcement Learning Can Incentivize Reasoning Capacity in LLMs Beyond the Base Model
Researchers present a boundary-aware Curriculum Reinforcement Learning approach that improves the reasoning capacity of large language models beyond what standard reinforcement learning with verifiable rewards (RLVR) achieves. Tests on Qwen, Llama, and DeepSeek models show a 9.8-percentage-point improvement in pass@256 over the base models, suggesting a more scalable path for continuous LLM advancement.