AI · Bearish · arXiv – CS AI · 9h ago · 7/10
Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges
Researchers demonstrate that LLM-based judges used in AI benchmarking are highly vulnerable to manipulation through post-decision interaction, with targeted challenges capable of overturning initial evaluations despite high confidence scores. This vulnerability introduces a critical failure mode in automated evaluation systems that could degrade benchmark reliability and ranking accuracy.