#rubric-rewards News & Analysis

2 articles tagged with #rubric-rewards. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO

Researchers demonstrate that Group Relative Policy Optimization (GRPO), combined with a novel Variance-Aware Reward Framework, significantly improves smaller LLMs' performance on medical question answering, particularly for heart-related queries. The approach achieves a 38% accuracy improvement on a held-out test set while remaining competitive with much larger models, offering a practical path toward efficient, deployable medical AI systems.

AI · Bullish · arXiv – CS AI · Jun 1 · 6/10

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Researchers introduce LongTraceRL, a reinforcement learning method that improves large language models' ability to reason over lengthy documents by using search agent trajectories and entity-level reward signals. The approach generates challenging training contexts with high-confusability distractors and applies rubric rewards that supervise intermediate reasoning steps. It shows consistent improvements across multiple LLM sizes and benchmarks.