AI · Neutral · arXiv – CS AI · 15h ago · 6/10
AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
AMARIS is a new system that improves how large language models are trained with rubric-based reinforcement learning by maintaining a persistent memory of past training data and failures. Unlike existing methods, which look only at immediate, local information, AMARIS tracks recurring problems and previous rubric adjustments over time, achieving measurable performance improvements across multiple domains.