AINeutralarXiv – CS AI · 6h ago6/10
🧠
AURA: Adaptive Uncertainty-aware Refinement for LLM-as-a-Judge Auditing
Researchers introduce AURA, a framework that improves the reliability of using large language models as judges for evaluating generated text by iteratively learning human-consistency patterns and prioritizing uncertain comparisons for human review. The approach addresses the core challenge that LLM judges often reflect their own biases rather than genuine human preferences, even when some human feedback is available.