ASH: Agents that Self-Hone via Embodied Learning
Researchers introduce ASH, an agentic system that learns embodied policies from unlabeled internet video without reward shaping or expert demonstrations. Using a self-improvement loop built on Inverse Dynamics Models, ASH makes sustained progress on long-horizon tasks in Pokemon Emerald and Legend of Zelda, significantly outperforming baseline approaches.
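
For readers unfamiliar with the term, an inverse dynamics model infers which action was taken between two consecutive observations, which is what allows action-free video to be turned into pseudo-labeled training data. The sketch below is a toy illustration of that general idea only, not ASH's implementation: the grid world, the tabular model, and every name in it (`step`, `fit_idm`, `label_video`) are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy action set for a 2-D grid; not taken from ASH.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Toy deterministic environment: move one cell on an unbounded grid."""
    dx, dy = ACTIONS[action]
    return (state[0] + dx, state[1] + dy)


def fit_idm(transitions):
    """Fit a tabular inverse dynamics model from (state, action, next_state) triples.

    The model maps each observed displacement to the action that most often caused it.
    """
    counts = defaultdict(Counter)
    for state, action, next_state in transitions:
        delta = (next_state[0] - state[0], next_state[1] - state[1])
        counts[delta][action] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in counts.items()}


def label_video(idm, frames):
    """Pseudo-label an action-free state sequence; unseen transitions get None."""
    labels = []
    for prev, nxt in zip(frames, frames[1:]):
        delta = (nxt[0] - prev[0], nxt[1] - prev[1])
        labels.append(idm.get(delta))
    return labels


# The agent's own experience, where actions are known.
experience = []
state = (0, 0)
for action in ["up", "right", "down", "left"]:
    nxt = step(state, action)
    experience.append((state, action, nxt))
    state = nxt

idm = fit_idm(experience)

# Stand-in for unlabeled video: states only, no actions.
video = [(5, 5), (6, 5), (6, 4), (6, 4)]
pseudo_labels = label_video(idm, video)
print(pseudo_labels)
```

In a self-improvement loop of this general kind, the pseudo-labeled sequences would be used to train a policy, and the policy's new experience would in turn be used to refit the inverse dynamics model. Transitions the model has never seen (here, the final frame pair with no movement) stay unlabeled rather than being guessed.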