AIBullisharXiv โ CS AI ยท 14h ago7/10
๐ง
Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky
Researchers introduce DiaFORGE, a three-stage framework for training LLMs to reliably invoke enterprise APIs by focusing on disambiguation between similar tools and underspecified arguments. Fine-tuned models achieved 27-49 percentage points higher tool-invocation success than GPT-4o and Claude-3.5-Sonnet, with an open corpus of 5,000 production-grade API specifications released for further research.
๐ง GPT-4๐ง Claude