y0news
🧠 AI · 🟢 Bullish · Importance 6/10

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

arXiv – CS AI | Vaishali Senthil, Ashutosh Hathidara, Sebastian Schreiber
🤖 AI Summary

Researchers introduce CoHyDE, an iterative co-training method that jointly optimizes a dense encoder and an LLM rewriter to improve tool retrieval for AI agents. The approach outperforms single-component baselines by 2.5 to 8 percentage points of NDCG on standard and vague queries, narrowing the gap between colloquial user language and technical API vocabularies.

Analysis

CoHyDE targets a core infrastructure problem in LLM agent development: retrieving the right tool from a large API catalog when the user's query lacks technical precision. Existing approaches fail in predictable ways. Fine-tuned encoders struggle with informal language, while frozen LLM rewriters generate descriptions that are disconnected from the actual catalog content. The research indicates that coupling the two components in a feedback loop addresses these complementary weaknesses more effectively than either approach alone.

The technical contribution centers on iterative alignment: the encoder learns from the hypothetical descriptions that the rewriter generates, while the rewriter receives preference signals derived from the encoder's retrieval rankings. This mutual reinforcement follows a broader co-training pattern in machine learning, in which model components learn from each other's outputs rather than from static gold standards.
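The feedback signal described above can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' implementation: the catalog, the bag-of-words `embed` function standing in for the dense encoder, and the helper names `rank_of_gold` and `preference_pair` are all hypothetical, and the summary does not say which preference-optimization objective or encoder loss the paper uses.

```python
import math
from collections import Counter

# Hypothetical three-tool catalog: tool name -> technical description.
CATALOG = {
    "get_forecast": "retrieve weather forecast by city and date",
    "convert_currency": "convert amount between currency codes using exchange rate",
    "send_sms": "send text message to phone number",
}

def embed(text):
    """Stand-in for the dense encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def rank_of_gold(description, gold_tool):
    """1-based rank of the gold tool when retrieving with `description`."""
    q = embed(description)
    ranked = sorted(
        CATALOG, key=lambda name: cosine(q, embed(CATALOG[name])), reverse=True
    )
    return ranked.index(gold_tool) + 1

def preference_pair(candidates, gold_tool):
    """Return (chosen, rejected): the rewrites that rank the gold tool best and worst."""
    scored = sorted(candidates, key=lambda c: rank_of_gold(c, gold_tool))
    return scored[0], scored[-1]

# Two hypothetical rewrites an LLM might produce for "will I need an umbrella?".
candidates = [
    "retrieve weather forecast for a city",
    "send a message about the rain",
]
chosen, rejected = preference_pair(candidates, "get_forecast")
# In a full round, (chosen, rejected) would feed the rewriter's preference update,
# and (chosen, gold tool) pairs would feed the encoder's fine-tuning step.
# Both training steps are omitted here.
```

The point of the sketch is the direction of the signal: the rewriter is never scored against a reference description, only against how well its output retrieves the correct tool under the current encoder.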

For the AI agent ecosystem, tool retrieval quality directly affects reasoning quality and task completion rates. An agent that reliably retrieves the appropriate API carries out the user's intent more accurately, which reduces a failure mode common in production deployments. The 6.3 percentage point improvement on deliberately vague queries suggests that CoHyDE specifically strengthens agents' ability to handle ambiguous real-world requests.

The ToolBench evaluation covers roughly 10,000 tools, a realistic scale, but generalization to other API catalogs and domains remains untested. Adoption will depend on whether teams can replicate the results on their proprietary tool collections and whether the retrieval gains justify the computational cost of iterative training in deployed systems.

Key Takeaways
  • CoHyDE achieves NDCG improvements of 2.5 to 8 pp over single-component baselines by iteratively co-training the encoder and the LLM rewriter.
  • The method is strongest on vague or underspecified queries, gaining up to 8 pp on the hardest ambiguity tier.
  • Ablations confirm that co-training is essential: neither component alone matches the combined performance.
  • The approach bridges the vocabulary gap between colloquial user language and technical API catalogs through mutual feedback.
  • Results hold at the scale of roughly 10,000 real-world tools from ToolBench, but generalization to other domains remains unexplored.
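For readers unfamiliar with the metric behind these numbers, here is a minimal sketch of NDCG at a cutoff k. It assumes linear gain and that every relevant tool appears somewhere in the scored list. The summary does not state which cutoff or gain variant the paper uses, so the function names and the choice of k below are illustrative only.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance of each retrieved tool, in ranked order (1 = correct tool).
perfect = ndcg_at_k([1, 0, 0], k=3)   # correct tool ranked first
demoted = ndcg_at_k([0, 1, 0], k=3)   # correct tool ranked second
```

A gain reported in percentage points of NDCG is an absolute difference between two such scores averaged over the query set, after multiplying by 100.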