AINeutralarXiv โ CS AI ยท 4h ago6/10
๐ง
The A-R Behavioral Space: Execution-Level Profiling of Tool-Using Language Model Agents in Organizational Deployment
Researchers introduce a new behavioral measurement framework for tool-augmented language models deployed in organizations, using a two-dimensional Action Rate and Refusal Signal space to profile how LLM agents execute tasks under different autonomy configurations and risk contexts. The approach prioritizes execution-layer characterization over aggregate safety scoring, revealing that reflection-based scaffolding systematically shifts agent behavior in high-risk scenarios.