AI · Bullish · arXiv – CS AI · 8h ago · 7/10
Training the Orchestrator: A Supervised Approach to End-to-End PDDL Planning with LLM Agents
Researchers introduce HALO, a trained orchestrator that cuts LLM API costs by 45x compared to GPT-4-mini while matching performance on PDDL planning tasks. Instead of prompting a frontier model at every step, HALO is trained directly on verifier-certified trajectories, and the cost savings hold across multiple planning benchmarks.
🧠 GPT-5 · 🧠 Gemini