AINeutralarXiv โ CS AI ยท 16h ago6/10
๐ง
FinRetrieval: A Benchmark for Financial Data Retrieval by AI Agents
Researchers introduced FinRetrieval, a benchmark testing AI agents' ability to retrieve financial data, evaluating 14 configurations across major providers. The study found that tool availability dramatically impacts performance, with Claude Opus achieving 90.8% accuracy using structured APIs versus only 19.8% with web search alone.
๐ข OpenAI๐ข Anthropic๐ง Claude