Researchers introduce NTILC, a neural framework that replaces in-context tool registry lookups with learned latent retrieval for language model agents. The approach reduces context token consumption by over 95% and inference latency by up to 74% while maintaining selection accuracy through signature-aware optimization.
NTILC addresses a fundamental scaling problem in agentic AI systems where tool registries grow faster than prompt context windows can accommodate. Traditional in-context tool calling embeds full specifications directly in prompts, so prompt cost grows linearly with registry size and performance degrades as registries expand. This research uses learned embeddings and external retrieval to circumvent the problem, mapping both user intent and tool schemas into a shared latent space for efficient selection.
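The shared-latent-space idea can be illustrated with a minimal retrieval sketch. This is not NTILC's implementation: the function names, the toy 3-dimensional embeddings, and the use of cosine similarity are all assumptions made for illustration, and a real system would obtain the vectors from trained encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_tools(query_vec, tool_index, k=2):
    """Return the names of the k tools whose embeddings are closest to the query."""
    ranked = sorted(
        tool_index,
        key=lambda name: cosine(query_vec, tool_index[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings standing in for encoded tool schemas.
TOOL_INDEX = {
    "get_weather": [0.9, 0.1, 0.0],
    "get_forecast": [0.8, 0.3, 0.1],
    "send_email": [0.0, 0.2, 0.9],
}

# Hypothetical embedding standing in for an encoded weather-related request.
query = [1.0, 0.2, 0.0]
selected = retrieve_tools(query, TOOL_INDEX, k=2)
print(selected)
```

Only the retrieved tool names (and, in a real agent, their specifications) would need to reach the prompt, which is where the context savings come from.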
The technical innovation centers on a composite loss function combining Circle Loss with Functional Margin Loss, enforcing separation between semantically similar but functionally incompatible tools. This constraint-aware approach recognizes that similarity alone is insufficient: tools must also satisfy argument schemas, type compatibility, and return type requirements. The signature-aware conditioning prevents the hallucinated or mismatched function calls that plague naive semantic matching.
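A rough sketch of how such a composite objective might be assembled follows. The Circle Loss term uses the standard published pair-similarity formulation. The summary does not define Functional Margin Loss, so the hinge term below, the `weight` parameter, and all default hyperparameters are assumptions chosen for illustration rather than the authors' definition.

```python
import math

def circle_loss(pos_sims, neg_sims, margin=0.25, gamma=32.0):
    """Circle Loss over similarities to matching (pos) and non-matching (neg) tools."""
    o_p, o_n = 1.0 + margin, -margin          # optimum similarity for positives / negatives
    delta_p, delta_n = 1.0 - margin, margin   # decision margins
    pos_terms = sum(
        math.exp(-gamma * max(o_p - s, 0.0) * (s - delta_p)) for s in pos_sims
    )
    neg_terms = sum(
        math.exp(gamma * max(s - o_n, 0.0) * (s - delta_n)) for s in neg_sims
    )
    return math.log1p(pos_terms * neg_terms)

def functional_margin_loss(pos_sim, incompatible_sims, margin=0.3):
    """Assumed hinge term: each signature-incompatible tool should score
    at least `margin` below the correct tool."""
    return sum(max(0.0, s - pos_sim + margin) for s in incompatible_sims)

def composite_loss(pos_sim, neg_sims, incompatible_sims, weight=1.0):
    """Circle Loss plus a weighted functional margin penalty (weighting is an assumption)."""
    return circle_loss([pos_sim], neg_sims) + weight * functional_margin_loss(
        pos_sim, incompatible_sims
    )

# Well-separated case: the incompatible look-alike scores far below the correct tool.
good = composite_loss(0.9, neg_sims=[0.1, 0.2], incompatible_sims=[0.2])
# Poorly separated case: a look-alike with the wrong signature scores almost as high.
bad = composite_loss(0.9, neg_sims=[0.1, 0.85], incompatible_sims=[0.85])
print(good < bad)
```

The point of the second term is that a tool with a near-identical description but an incompatible signature is penalized explicitly, instead of relying on the embedding geometry alone to keep it away from the query.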
For the AI infrastructure ecosystem, NTILC has immediate implications. Developers building tool-calling agents can support much larger registries without proportional context overhead, directly improving latency and reducing inference costs. A context reduction of over 95% translates to measurable operational savings at scale. The framework particularly benefits systems managing heterogeneous APIs, where interference from irrelevant tools currently causes selection errors.
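The scaling argument can be made concrete with back-of-the-envelope arithmetic. The numbers below (150 tokens per tool specification, 5 retrieved tools) are illustrative assumptions, not figures from the NTILC work, and the sketch assumes a retrieval-based agent still places the top-k specifications in the prompt.

```python
def in_context_tokens(num_tools, tokens_per_spec):
    """Prompt tokens when every tool specification is embedded in the prompt."""
    return num_tools * tokens_per_spec

def retrieval_tokens(top_k, tokens_per_spec):
    """Prompt tokens when only the top-k retrieved specifications are included."""
    return top_k * tokens_per_spec

def reduction(num_tools, top_k, tokens_per_spec):
    """Fraction of tool-related prompt tokens saved by retrieval."""
    full = in_context_tokens(num_tools, tokens_per_spec)
    return 1 - retrieval_tokens(top_k, tokens_per_spec) / full

# Illustrative registry sizes; in-context cost grows linearly, retrieval cost stays flat.
for n in (50, 200, 1000):
    print(n, in_context_tokens(n, 150), retrieval_tokens(5, 150),
          f"{reduction(n, 5, 150):.1%}")
```

Under these assumptions the in-context cost scales with the registry while the retrieval cost depends only on k, so the relative saving grows as the registry does.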
Looking ahead, the success of learned retrieval over in-context lookup may influence broader design patterns in agent architectures. Future research will likely explore integration with knowledge graphs or dynamic registry updates. The work suggests that as agentic systems mature, external retrieval may come to dominate context-based approaches for tool discovery.
- NTILC replaces prompt-based tool registries with learned retrieval, reducing context usage by over 95%
- Signature-aware loss functions prevent incompatible tool selection despite semantic similarity
- Inference latency decreases by up to 74%, reducing operational costs for deployed agents
- The approach enables scalable tool registry management without degrading selection accuracy
- External retrieval architecture may become standard practice for large-scale agentic AI systems