y0news

#grpo News & Analysis

29 articles tagged with #grpo. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4

Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Researchers show through a first-principles analysis that Group Relative Policy Optimization (GRPO), traditionally viewed as an on-policy reinforcement learning algorithm, can be reinterpreted as an off-policy algorithm. The result offers new insight into applying reinforcement learning to large language models and suggests principled approaches to off-policy RL algorithm design.
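
For readers unfamiliar with the "group-relative" part of GRPO, the sketch below shows the commonly described advantage computation: several completions are sampled for one prompt, and each reward is standardized against the others in its group. It does not reproduce the paper's off-policy analysis. The function name, the `eps` constant, and the use of the population standard deviation are illustrative assumptions, not details taken from the article.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Standardize each reward against the other samples for the same prompt.

    `eps` is an assumed small constant that avoids division by zero when
    every sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled completions for one prompt, scored 1.0 (correct) or 0.0 (wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)  # positive for above-average samples, negative otherwise
```

Because the baseline is the group mean, no separate value network is needed; the advantages within a group always sum to approximately zero.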

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5

NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning

Researchers introduced NoRD (No Reasoning for Driving), a Vision-Language-Action model for autonomous driving that achieves competitive performance using 60% less training data and no reasoning annotations. The model incorporates the Dr. GRPO algorithm to overcome difficulty bias in reinforcement learning, and reports successful results on the Waymo and NAVSIM benchmarks.
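
The difficulty bias mentioned above can be illustrated with a small comparison. As Dr. GRPO is generally described, it drops the division by the group standard deviation that GRPO applies to advantages (it also changes how the loss is normalized by response length, which is not shown here). The sketch below is not NoRD's training code; the function names and the toy reward groups are hypothetical.

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: centered, then divided by the group std."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def dr_grpo_advantages(rewards):
    """Dr. GRPO-style advantages (as assumed here): centered only."""
    mu = mean(rewards)
    return [r - mu for r in rewards]

# Hypothetical reward groups for two prompts, eight samples each.
easy = [1.0] * 7 + [0.0]         # nearly always solved, so low reward variance
mixed = [1.0] * 4 + [0.0] * 4    # solved half the time, so maximal variance

for name, group in (("easy", easy), ("mixed", mixed)):
    scaled = grpo_advantages(group)[-1]
    raw = dr_grpo_advantages(group)[-1]
    # The ratio is about 1 / std: how much std normalization amplifies the prompt.
    print(name, round(scaled, 3), round(raw, 3), round(scaled / raw, 3))
```

Dividing by the standard deviation rescales each prompt's advantages by roughly 1/σ, so prompts with nearly uniform rewards (very easy or very hard) are amplified relative to mixed ones. Removing that division leaves every prompt on the same scale.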

AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 8

Soft Sequence Policy Optimization

Researchers introduce Soft Sequence Policy Optimization (SSPO), a new reinforcement learning method for training Large Language Models that improves upon existing policy optimization approaches. The technique uses soft gating functions and sequence-level importance sampling to enhance training stability and performance in mathematical reasoning tasks.

AI · Neutral · Hugging Face Blog · May 25 · 3/10 · 5

🐯 Liger GRPO meets TRL

The article body was empty, so this summary rests on the title alone: the post appears to cover integrating Liger GRPO (Group Relative Policy Optimization) with TRL (Transformer Reinforcement Learning), a technical development in AI model training and optimization.
