y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#tool-retrieval News & Analysis

2 articles tagged with #tool-retrieval. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

2 articles
AINeutralarXiv โ€“ CS AI ยท Feb 277/107
๐Ÿง 

LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

LiveMCPBench introduces the first large-scale benchmark evaluating AI agents' ability to navigate real-world tasks using Model Context Protocol (MCP) tools across multiple servers. The benchmark reveals significant performance gaps, with top model Claude-Sonnet-4 achieving 78.95% success while most models only reach 30-50%, identifying tool retrieval as the primary bottleneck.

$OCEAN
AINeutralarXiv โ€“ CS AI ยท Mar 27/1020
๐Ÿง 

HumanMCP: A Human-Like Query Dataset for Evaluating MCP Tool Retrieval Performance

Researchers have released HumanMCP, the first large-scale dataset designed to evaluate tool retrieval performance in Model Context Protocol (MCP) servers. The dataset addresses a critical gap by providing realistic, human-like queries paired with 2,800 tools across 308 MCP servers, improving upon existing benchmarks that lack authentic user interaction patterns.