AI · Neutral · arXiv CS AI · 5h ago
How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities
Researchers introduce SteerEval, a new benchmark for evaluating how controllable Large Language Models are across language features, sentiment, and personality domains. The study reveals that current steering methods often fail at finer-grained control levels, highlighting significant risks when deploying LLMs in socially sensitive applications.