AINeutralarXiv – CS AI · 8h ago6/10
🧠
REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge
Researchers introduce REAL, a reinforcement learning framework that optimizes LLMs used as automated evaluators by recognizing ordinal relationships in scoring tasks rather than treating outputs as binary outcomes. The method demonstrates significant performance improvements across model scales, achieving up to +8.40 Pearson correlation gains on Qwen3-32B compared to supervised fine-tuning baselines.