AI · Neutral · arXiv – CS AI · 15h ago · 6/10
TowerMind: A Tower Defence Game Learning Environment and Benchmark for LLM as Agents
Researchers introduce TowerMind, a lightweight tower defense game environment designed to evaluate Large Language Models as autonomous agents. The benchmark tests LLMs' capabilities in strategic planning and real-time decision-making while revealing significant performance gaps compared to human experts and highlighting key limitations in model reasoning.