NTILC: Neural Tool Invocation via Learned Compression
Researchers introduce NTILC, a neural framework that replaces in-context tool registry lookups with learned latent retrieval for language model agents. The approach reduces context token consumption by over 95% and inference latency by up to 74% while maintaining selection accuracy through signature-aware optimization.
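To make the contrast with in-context registry lookups concrete, here is a minimal sketch of retrieving a tool by similarity in a vector space instead of listing every tool in the prompt. It is not NTILC's actual method: the summary does not describe the encoder or training objective, so the tool registry, the `embed` function (a bag-of-words stand-in for a learned encoder), and `retrieve_tool` are all hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical tool registry: tool name -> signature plus short description.
TOOLS = {
    "get_weather": "get_weather(city: str) -> dict  current weather forecast temperature for a city",
    "send_email": "send_email(to: str, subject: str, body: str) -> bool  send an email message to a recipient",
    "search_files": "search_files(pattern: str, directory: str) -> list  find files matching a pattern in a directory",
}


def embed(text):
    """Stand-in encoder (assumption): bag-of-words token counts.

    A learned encoder producing dense latent vectors would replace this.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Tool vectors are computed once, outside the prompt, so the registry
# never has to be serialized into the model's context window.
TOOL_VECTORS = {name: embed(text) for name, text in TOOLS.items()}


def retrieve_tool(query, k=1):
    """Return the k tool names whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(
        TOOL_VECTORS,
        key=lambda name: cosine(q, TOOL_VECTORS[name]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    print(retrieve_tool("what is the weather forecast for Paris"))
    print(retrieve_tool("email the report to my manager"))
```

In this sketch only the user query is encoded at request time, which is where the context savings in such a design would come from: the prompt carries the selected tool rather than the full registry.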