ASA: Backbone-Training-Free Representation Engineering for Tool-Calling Agents
Researchers introduce Activation Steering Adapter (ASA), a training-free method that improves LLM tool-calling reliability by intervening on mid-layer activations at inference time. The approach achieves significant performance gains on tool-use benchmarks without parameter updates, addressing a critical gap between what models internally represent and their actual behavior.
This research tackles a fundamental challenge in deploying language models as agents: the brittleness of tool-calling capabilities when facing distribution shifts or schema changes. Traditional approaches rely on prompt engineering, which lacks robustness, or fine-tuning, which introduces maintenance overhead and potential catastrophic forgetting. The authors discover that models actually encode tool necessity information reliably in their internal representations—the problem lies in translating that knowledge into action.
The Lazy Agent failure mode reveals a representation-behavior gap where models possess the requisite information to call tools but remain conservative in activating tool mode. This insight drives the development of ASA, which uses lightweight steering vectors applied at inference time to bridge this gap. By conditioning these vectors through a router and gating mechanism, the method amplifies true tool-calling intentions while suppressing false triggers.
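To make the router-plus-gate idea concrete, here is a minimal sketch of gated activation steering on a single hidden-state vector. It is an illustration under stated assumptions, not the authors' implementation. The linear router, the logistic gate, the function name `steer_activation`, and the parameters `alpha` and `threshold` are all hypothetical; the summary above does not specify how ASA builds its router, gate, or vectors.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def steer_activation(hidden, steer_vecs, router_rows, gate_w, gate_b=0.0,
                     alpha=4.0, threshold=0.5):
    """Conditionally add a steering vector to one mid-layer activation.

    hidden      -- activation vector at the chosen layer (list of floats)
    steer_vecs  -- candidate steering vectors, one per route (assumed)
    router_rows -- weights of an assumed linear router, one row per route
    gate_w/b    -- weights of an assumed logistic probe for "tool needed"
    alpha       -- steering strength (hypothetical hyperparameter)
    threshold   -- gate probability below which nothing is changed
    """
    # Router: pick the steering vector whose route scores highest.
    scores = [dot(row, hidden) for row in router_rows]
    route = scores.index(max(scores))

    # Gate: estimated probability that this input needs a tool call.
    p_tool = 1.0 / (1.0 + math.exp(-(dot(gate_w, hidden) + gate_b)))

    # Gate closed: return the activation unchanged to avoid false triggers.
    if p_tool < threshold:
        return list(hidden), route, p_tool

    # Gate open: push the activation along the normalized steering direction.
    vec = steer_vecs[route]
    norm = math.sqrt(dot(vec, vec))
    steered = [h + alpha * v / norm for h, v in zip(hidden, vec)]
    return steered, route, p_tool


# Toy 4-dimensional example with made-up weights.
steer_vecs = [[0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]]
router_rows = [[1.0, 0.0, 0.0, 0.0], [-1.0, 0.0, 0.0, 0.0]]
gate_w = [5.0, 0.0, 0.0, 0.0]

needs_tool = [1.0, 0.0, 0.0, 0.0]   # gate opens, activation is steered
no_tool = [-1.0, 0.0, 0.0, 0.0]     # gate stays closed, activation untouched

print(steer_activation(needs_tool, steer_vecs, router_rows, gate_w))
print(steer_activation(no_tool, steer_vecs, router_rows, gate_w))
```

In a real model the same logic would run inside a forward hook on the chosen layer. The design choice the sketch illustrates is that the gate keeps the intervention conditional: inputs that do not look like tool requests pass through unmodified, which is how a steering method can raise recall without raising false positives.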
The practical implications are substantial for AI practitioners building production agents. ASA requires only ~20KB of portable assets and zero weight updates, making it trivially deployable across different model instances and versions. The reported gains are large: F1 on strict tool-use metrics nearly triples, from 0.18 to 0.50, while the false positive rate falls from 0.15 to 0.05. These results suggest that representation engineering offers an efficient middle ground between brittle prompting and expensive fine-tuning.
This work points toward a broader trend of inference-time optimization techniques that preserve model generalization while improving specialized capabilities. Future development likely involves extending similar steering approaches to other agent failure modes and exploring how representation steering scales across model sizes and architectures.
- ASA raises F1 on strict tool-use tasks from 0.18 to 0.50 without any parameter updates to the model
- The method identifies and exploits a representation-behavior gap where models encode tool necessity but fail to act on it
- Only ~20KB of portable assets are required, making deployment trivial across different model instances
- The false positive rate drops from 0.15 to 0.05, significantly improving the practical usability of tool-calling agents
- The training-free, inference-time approach offers efficiency advantages over continual fine-tuning while maintaining robustness