AI · Bullish · arXiv – CS AI · 7h ago · 7/10
ISE: An Execution-Grounded Recipe for Multi-Turn OS-Agent Trajectories
Researchers introduce ISE (Intent → Simulate → Execute), a three-stage recipe for building training data for OS agents. It produces 43,956 structured intents and 23,132 multi-turn trajectories, each validated by live execution. Qwen3-8B fine-tuned on this dataset reaches 37.7% pass@1 on ClawEval, outperforming GPT-4o zero-shot and the larger Qwen3-32B model, which suggests that carefully designed synthetic data can offset a smaller model size.