AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Agentic Transformers Provably Learn to Search via Reinforcement Learning
Researchers demonstrate that transformer-based AI agents can learn tree-search capabilities through reinforcement learning without explicit instruction, with attention heads specializing to track action history and detect failures. The findings show how agents develop depth-first search mechanisms during training and generalize to problems deeper than those seen in training, advancing the theoretical understanding of how language models acquire reasoning abilities.
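For readers unfamiliar with the search procedure named in the summary, here is a minimal sketch of classical depth-first search with backtracking. It is not the paper's model or training setup. It only illustrates the two ingredients the summary mentions: keeping a history of the current branch and detecting a dead end before backing up. The function name, the toy tree, and the goal labels are all hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Optional


def dfs_with_backtracking(
    root: Hashable,
    children: Callable[[Hashable], Iterable[Hashable]],
    is_goal: Callable[[Hashable], bool],
) -> Optional[List[Hashable]]:
    """Return the path from root to a goal node, or None if no goal exists."""
    if is_goal(root):
        return [root]
    path = [root]                        # history: nodes on the current branch
    iterators = [iter(children(root))]   # untried children at each depth
    while iterators:
        child = next(iterators[-1], None)
        if child is None:
            # Failure detected: this node has no untried children, so back up.
            iterators.pop()
            path.pop()
            continue
        path.append(child)
        if is_goal(child):
            return path
        iterators.append(iter(children(child)))
    return None


# Hypothetical toy tree, used only for illustration.
TREE = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

found = dfs_with_backtracking("root", lambda n: TREE[n], lambda n: n == "b1")
missing = dfs_with_backtracking("root", lambda n: TREE[n], lambda n: n == "zzz")
print(found)
print(missing)
```

In this sketch the search explores the whole "a" subtree, hits dead ends at "a1" and "a2", backtracks to the root, and then reaches "b1" through "b". The same loop works unchanged on deeper trees, since the only state it carries is the current branch and the untried children at each level.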