AI · Neutral · arXiv – CS AI · 5h ago
Automated Concept Discovery for LLM-as-a-Judge Preference Analysis
Researchers developed automated methods for discovering biases in Large Language Models used as judges and applied them to more than 27,000 paired responses. The study found that LLM judges exhibit systematic biases: they favor refusals of sensitive requests more than human annotators do, they prefer concrete and empathetic responses, and they are biased against certain legal guidance.