AI · Neutral · arXiv – CS AI · 10h ago · 7/10
Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
Researchers introduce Agent-ValueBench, the first comprehensive benchmark designed to evaluate the values embedded in autonomous AI agents, not just those of their underlying language models. The study finds that agent values diverge significantly from LLM values and are shaped more decisively by system harnesses and embedded skills than by traditional model alignment or prompt engineering.