#curriculum-learning News & Analysis

55 articles tagged with #curriculum-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Representation Curriculum: Stagewise Training for Robust Ranking and Allocation

Researchers propose Representation Curriculum (RC), a machine learning training method that improves ranking systems in digital marketplaces by strategically controlling when different data signals are introduced during model training. The approach reduces over-reliance on exposure-dependent historical signals and strengthens content-based merit evaluation, yielding better performance on cold-start scenarios and improved robustness across distribution shifts.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

RoboNaldo: Accurate, Stable and Powerful Humanoid Soccer Shooting via Motion-Guided Curriculum Reinforcement Learning

RoboNaldo, a motion-guided curriculum reinforcement learning framework, enables humanoid robots to perform accurate soccer shots with significantly improved stability and power compared to prior approaches. The system uses a three-stage training process that progresses from mimicking human motion to adapting kicks for varied ball positions and moving targets, achieving real-world performance on a Unitree G1 robot with shot errors under 1 meter from 3 meters away.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

Researchers have developed a self-paced curriculum reinforcement learning framework for training autonomous agents to race superbikes in a physics-accurate simulator, combining Soft Actor-Critic algorithms with dynamic task progression. The approach demonstrates superior training efficiency and performance compared to traditional RL methods, establishing a new baseline for two-wheeled autonomous racing where balance and lean dynamics significantly increase complexity over four-wheeled vehicles.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

Researchers developed a data synthesis methodology for neural machine translation of Q'eqchi' Mayan, using synthetic corpora derived from community dictionaries and Parameter-Efficient Fine-Tuning to avoid extractive web-scraping. While the approach achieved strong structural performance (BLEU 42.02 on synthetic data), it revealed a critical gap: the model excels at learning grammar but fails to acquire authentic semantic grounding (BLEU 0.59 on organic text), suggesting synthetic bootstrapping alone cannot replace real-world linguistic diversity.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

CLPO: Curriculum Learning meets Policy Optimization for LLM Reasoning

Researchers introduce CLPO, a curriculum learning framework that dynamically adapts training difficulty for large language models during reinforcement learning. The approach automatically identifies solved, medium, and hard problems, then strategically restructures tasks to match the model's evolving capabilities, achieving substantial improvements over existing methods on mathematical and reasoning benchmarks.
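As a rough illustration of the solved/medium/hard split described above, the sketch below buckets problems by an empirical pass rate. The summary does not say how CLPO measures difficulty, so the pass-rate signal, the function name `bucket_by_pass_rate`, and both thresholds are assumptions made for this example, not the paper's method.

```python
from typing import Dict, List


def bucket_by_pass_rate(
    pass_rates: Dict[str, float],
    solved_at: float = 1.0,   # hypothetical threshold: every sampled attempt passed
    hard_below: float = 0.25,  # hypothetical threshold: fewer than 25% of attempts passed
) -> Dict[str, List[str]]:
    """Group problem ids into solved / medium / hard by empirical pass rate."""
    buckets: Dict[str, List[str]] = {"solved": [], "medium": [], "hard": []}
    for problem_id, rate in pass_rates.items():
        if rate >= solved_at:
            buckets["solved"].append(problem_id)
        elif rate < hard_below:
            buckets["hard"].append(problem_id)
        else:
            buckets["medium"].append(problem_id)
    return buckets


if __name__ == "__main__":
    # Pass rates would normally come from sampling the current policy several times per problem.
    print(bucket_by_pass_rate({"p1": 1.0, "p2": 0.5, "p3": 0.0}))
```

Re-running the bucketing as the policy improves is what would make such a split track the model's evolving capabilities.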

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Severity-Aware Curriculum Learning with Multi-Model Response Selection for Medical Text Generation

Researchers introduce a severity-aware curriculum learning framework for medical text generation that trains multiple large language models sequentially on cases of increasing complexity, then selects the best response during inference. The approach achieves 90.30% performance on the MAQA dataset, demonstrating that combining progressive training strategies with multi-model ensembles improves medical AI reliability across varying case severities.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data

Researchers prove that Transformers trained with reinforcement learning and outcome-based rewards spontaneously develop chain-of-thought reasoning capabilities, but only when training data includes sufficient 'simple examples' requiring fewer reasoning steps. The findings bridge theory and practice, explaining how sparse reward signals drive emergence of interpretable algorithmic behavior in language models.

AI · Bullish · arXiv – CS AI · Jun 1 · 6/10

D$^3$: Dynamic Directional Graph-Constrained Data Scheduling for LLM Training

Researchers introduce D³, a novel data scheduling framework for LLM training that models interactions between training samples as a dynamic directional graph to optimize training order. The approach outperforms existing data scheduling methods while maintaining computational efficiency through an approximation algorithm.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

PROWL: Prioritized Regret-Driven Optimization for World Model Learning

Researchers introduce PROWL, an adversarial training framework that improves world model robustness by actively discovering failure modes rather than passively learning from demonstration data. The approach uses a KL-constrained policy to expose high-error trajectories in diffusion-based video models while maintaining behavioral constraints, with a prioritized buffer that focuses training on unresolved weaknesses. Results demonstrate significant improvements in handling rare, interaction-critical transitions that matter for downstream planning and policy performance.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility

Researchers introduce the Data-Model Compatibility (DMC) metric to evaluate how well training datasets align with student models during reasoning distillation from large language models. The metric jointly assesses data quality, difficulty, and student capability, demonstrating strong correlation with distillation performance and enabling dynamic dataset selection that improves outcomes across multiple models and tasks.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Micro-Macro Retrieval: Reducing Long-Form Hallucination in Large Language Models

Researchers propose Micro-Macro Retrieval (M2R), a framework that reduces hallucination in large language models during long-form text generation by keeping key information closer to model outputs. The method combines coarse-grained external retrieval with fine-grained extraction from an internal knowledge repository, addressing a critical bottleneck where proximity of evidence to final answers directly correlates with factual accuracy.

AI · Bullish · arXiv – CS AI · May 29 · 6/10

Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR

Researchers propose PACED-RL, a novel post-training framework that reinterprets the partition function in GFlowNet-based LLM training as a difficulty scheduler rather than merely a normalizer. By leveraging per-prompt accuracy signals, the method improves sample efficiency and maintains generation diversity while outperforming existing reward-maximizing approaches.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Mechanistically Interpreting the Role of Sample Difficulty in RLVR for LLMs

Researchers mechanistically analyze how sample difficulty affects Reinforcement Learning with Verifiable Reward (RLVR) training in large language models, discovering that medium-difficulty problems yield optimal reasoning improvements while overly hard problems degrade performance. The study proposes difficulty-adaptive strategies using backward-reasoning reformulation and sparse autoencoders to optimize reward signals during training.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Restoring the Sweet Spot: Pass-Rate Weighted Self-Distillation for LLM Reasoning

Researchers propose SC-SDPO, a self-distillation method that improves how large language models learn from their own feedback during training. By weighting training examples by pass rate, a proxy for question difficulty, the method achieves 3-4% performance gains on reasoning benchmarks while maintaining stable training dynamics.
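To make the "sweet spot" in the title concrete, here is a minimal sketch of a pass-rate weight that peaks on questions the model solves about half the time. The summary does not give SC-SDPO's actual weighting, so the function name `sweet_spot_weight` and the `4 * p * (1 - p)` form are assumptions chosen only to illustrate the idea.

```python
def sweet_spot_weight(pass_rate: float) -> float:
    """Illustrative weight: 0 for always-failed or always-solved questions, 1 at pass_rate 0.5.

    This quadratic form is an assumption for the example, not the paper's formula.
    """
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be in [0, 1]")
    return 4.0 * pass_rate * (1.0 - pass_rate)


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, sweet_spot_weight(p))
```

A weight of this shape down-weights examples that carry little learning signal at either extreme of difficulty.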

AI · Bullish · arXiv – CS AI · May 27 · 6/10

Learning to Act under Noise: Enhancing Agent Robustness via Noisy Environments

Researchers introduce NoisyAgent, a training framework that improves large language model agent robustness by deliberately exposing them to environmental imperfections during training. By simulating real-world interaction noise—including user ambiguity and tool failures—the approach bridges the gap between idealized benchmark performance and practical deployment reliability.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

CrossVL: Complexity-Aware Feature Routing and Paired Curriculum for Cross-View Vision-Language Detection

CrossVL introduces a novel framework combining Complexity-Aware Pathway Aggregation and Paired Curriculum Learning to improve vision-language model performance in cross-view object detection scenarios. The approach addresses fundamental challenges when models operate across different viewpoints (ground and aerial), achieving measurable improvements in detection accuracy and consistency on the MAVREC dataset.

AI · Bullish · arXiv – CS AI · May 11 · 6/10

Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning

Researchers introduce Goldilocks, a curriculum learning strategy that improves reinforcement learning efficiency for language models by having a teacher model dynamically select training questions of optimal difficulty for the student model. This addresses the sample inefficiency problem in sparse-reward RL training and demonstrates performance gains on reasoning tasks compared to standard approaches.

AI · Neutral · arXiv – CS AI · May 7 · 6/10

Overcoming Environmental Meta-Stationarity in MARL via Adaptive Curriculum and Counterfactual Group Advantage

Researchers propose CL-MARL, a curriculum learning framework for multi-agent reinforcement learning that dynamically adjusts task difficulty based on agent performance, addressing a fundamental limitation where fixed-difficulty training constrains policy generalization. The method achieves 40% win rate on complex cooperative tasks, outperforming existing baselines by significant margins.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

Improving LLM Code Generation via Requirement-Aware Curriculum Reinforcement Learning

Researchers propose RECRL, a requirement-aware curriculum reinforcement learning framework that improves large language model code generation by better perceiving programming requirement difficulty, optimizing challenging requirements, and employing adaptive sampling strategies. Testing across five LLMs and benchmarks shows 1.23%-5.62% average improvement in Pass@1 metrics compared to existing approaches.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning

Researchers introduce CLewR, a curriculum learning strategy that improves machine translation performance in large language models by reordering training data from easy to hard examples with periodic restarts. The approach demonstrates consistent improvements across multiple model families and preference optimization techniques, addressing a previously underexplored aspect of LLM training methodology.

🧠 Llama
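The easy-to-hard ordering with periodic restarts that the CLewR summary describes can be sketched as follows. The summary does not explain how difficulty is scored or when restarts fire, so the per-example difficulty scores, the function name `curriculum_with_restarts`, and the choice to replay the full sorted list once per cycle are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def curriculum_with_restarts(
    examples: Sequence[Tuple[T, float]],  # (example, hypothetical difficulty score)
    num_cycles: int,
) -> List[T]:
    """Return a training order that runs easy-to-hard, then restarts from easy each cycle."""
    if num_cycles < 1:
        raise ValueError("num_cycles must be at least 1")
    # sorted() is stable, so examples with equal difficulty keep their input order.
    ordered = [example for example, _ in sorted(examples, key=lambda pair: pair[1])]
    return ordered * num_cycles


if __name__ == "__main__":
    data = [("hard pair", 0.9), ("easy pair", 0.1), ("medium pair", 0.5)]
    print(curriculum_with_restarts(data, num_cycles=2))
```

Each restart re-exposes the model to easy examples after it has seen hard ones, which is one plausible reading of why restarts might help.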

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution

Researchers introduce vocabulary dropout, a technique to prevent diversity collapse in co-evolutionary language model training, where one model generates problems and another solves them. The method sustains proposer diversity and improves mathematical reasoning performance by 4.4 points on average in Qwen3 models.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

Researchers developed a scalable multi-turn synthetic data generation pipeline using reinforcement learning to improve large language models' code generation capabilities. The approach uses teacher models to create structured difficulty progressions and curriculum-based training, showing consistent improvements in code generation across Llama3.1-8B and Qwen models.

🧠 Llama

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning

Researchers developed E2H Reasoner, a curriculum reinforcement learning method that improves LLM reasoning by training on tasks from easy to hard. The approach shows significant improvements for small LLMs (1.5B-3B parameters) that struggle with vanilla RL training alone.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

CRAFT-GUI: Curriculum-Reinforced Agent For GUI Tasks

Researchers introduce CRAFT-GUI, a curriculum learning framework that uses reinforcement learning to improve AI agents' performance in graphical user interface tasks. The method addresses difficulty variation across GUI tasks and provides more nuanced feedback, achieving 5.6% improvement on Android Control benchmarks and 10.3% on internal benchmarks.

AI · Neutral · arXiv – CS AI · Mar 5 · 5/10

Curriculum-enhanced GroupDRO: Challenging the Norm of Avoiding Curriculum Learning in Subpopulation Shift Setups

Researchers propose Curriculum-enhanced Group Distributionally Robust Optimization (CeGDRO), a new machine learning approach that challenges conventional wisdom by using curriculum learning in subpopulation shift scenarios. The method achieves up to 6.2% improvement over state-of-the-art results on benchmark datasets like Waterbirds by strategically prioritizing hard bias-confirming and easy bias-conflicting samples.
