AI · Bullish · arXiv - CS AI · 7h ago · 6/10
Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge
Researchers demonstrate that large language models (LLMs) used as judges suffer from score range bias: their evaluation outputs are highly sensitive to the predefined scoring scale. Using contrastive decoding, they report an improvement of up to 11.7% in alignment with human judgments across different score ranges.
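
To make the idea concrete, here is a minimal sketch of contrastive decoding applied to a judge's score labels. It is not the paper's exact formulation, which the summary does not spell out. It assumes we can read log-probabilities for each candidate score from two models: a stronger "expert" judge and a weaker "amateur" model that shares the same scale bias. The function name `contrastive_score`, the `weight` parameter, and all probabilities below are hypothetical and chosen only for illustration.

```python
import math


def contrastive_score(expert_logprobs, amateur_logprobs, weight=1.0):
    """Pick the score label with the largest contrastive value.

    The contrastive value is expert log-prob minus `weight` times the
    amateur log-prob, so a preference that both models share (for
    example, a pull toward the middle of the scale) is discounted.
    All names here are illustrative assumptions, not a library API.
    """
    adjusted = {
        label: expert_logprobs[label] - weight * amateur_logprobs[label]
        for label in expert_logprobs
    }
    best = max(adjusted, key=adjusted.get)
    return best, adjusted


# Made-up probabilities over a 1-5 scale; both models lean toward 3.
expert = {s: math.log(p) for s, p in zip(range(1, 6), [0.05, 0.10, 0.40, 0.35, 0.10])}
amateur = {s: math.log(p) for s, p in zip(range(1, 6), [0.05, 0.10, 0.60, 0.15, 0.10])}

greedy = max(expert, key=expert.get)          # plain argmax over the expert
best, adjusted = contrastive_score(expert, amateur)
print(greedy, best)
```

In this toy case the expert alone picks the midpoint score, while the contrastive variant picks the score the expert favors much more strongly than the amateur does. Real systems would take the log-probabilities from actual model outputs and tune `weight` on held-out data.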