A Survey on Large Language Model-Based Game Agents
A comprehensive survey examines Large Language Model-based game agents (LLMGAs) as testbeds for capabilities relevant to artificial general intelligence. It synthesizes LLMGA design through a unified architecture: memory, reasoning, and perception-action interfaces at the single-agent level, plus communication protocols and organizational models for multi-agent coordination. The analysis spans six major game genres.
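The single-agent side of that architecture (memory, reasoning, and a perception-action interface) can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the survey's own implementation: the `GameAgent` class, its method names, and the `rule_based_reasoner` function are hypothetical, and the reasoner is a deterministic stand-in for what would be an LLM call in a real agent.

```python
from collections import deque
from typing import Callable, Deque


class GameAgent:
    """Hypothetical single-agent loop: perceive -> reason -> act, with memory."""

    def __init__(self, reason: Callable[[str], str], memory_size: int = 5) -> None:
        # `reason` maps a text prompt to an action string. In a real agent this
        # would wrap an LLM call; here it is an assumption left to the caller.
        self.reason = reason
        # Bounded memory: only the most recent `memory_size` events are kept.
        self.memory: Deque[str] = deque(maxlen=memory_size)

    def perceive(self, raw_state: dict) -> str:
        # Perception interface: serialize structured game state into text.
        return ", ".join(f"{key}={value}" for key, value in sorted(raw_state.items()))

    def step(self, raw_state: dict) -> str:
        observation = self.perceive(raw_state)
        # Build a prompt from recalled history plus the current observation.
        history = "\n".join(self.memory)
        prompt = f"History:\n{history}\nObservation: {observation}\nAction:"
        # Action interface: the reasoner's text output is treated as the action.
        action = self.reason(prompt).strip()
        self.memory.append(f"{observation} -> {action}")
        return action


def rule_based_reasoner(prompt: str) -> str:
    # Placeholder for an LLM: looks only at the latest observation line.
    latest = [line for line in prompt.splitlines() if line.startswith("Observation:")][-1]
    return "retreat" if "enemy_near=True" in latest else "explore"


agent = GameAgent(rule_based_reasoner, memory_size=2)
first_action = agent.step({"enemy_near": True, "hp": 10})
second_action = agent.step({"enemy_near": False, "hp": 10})
```

Passing the reasoner in as a plain function keeps the loop testable without a model, and the bounded memory illustrates only the simplest possible recall strategy.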
This survey reflects a shift in how the AI research community evaluates large language models beyond traditional benchmarks. Game environments are valuable testing grounds because they compress many real-world complexities (dynamic decision-making, resource management, social coordination, and goal formation) into controllable, measurable settings. Integrating LLMs into game agents can reveal capabilities in reasoning, adaptability, and generalization that standard NLP evaluations often miss.
The research reflects broader trends in AI development toward embodied intelligence and multimodal systems. Rather than treating language models as text-only systems, researchers increasingly deploy them as decision-making engines that perceive, strategize, and act in complex environments. This approach addresses a fundamental limitation of pure language benchmarks: they don't test whether systems can translate understanding into effective real-world action under uncertainty and time pressure.
For the AI industry, this framework has practical implications for developing more capable autonomous agents. Game genres present distinct challenge profiles: action games demand low-latency control, while sandbox worlds require open-ended planning. This lets researchers target and improve specific agent capabilities systematically. Companies developing autonomous systems, robotics, or interactive AI may be able to adapt these architectural insights to production environments.
Looking ahead, the field will likely focus on scaling multi-agent coordination in games with thousands of participants and on improving perception-action loops in real-time scenarios. The curated paper collection accompanying the survey signals growing research activity, suggesting this domain will become increasingly central to AI capability evaluation in the coming years.
- LLM-based game agents offer structured testbeds for evaluating AGI-relevant capabilities like reasoning, memory, and adaptability
- The unified architecture distinguishes single-agent components (memory, reasoning, interfaces) from multi-agent systems (communication, coordination)
- Six major game genres map to distinct agent requirements, enabling systematic capability assessment across complexity levels
- Game environments compress real-world complexity including dynamic decisions, resource constraints, and social behaviors into measurable systems
- Findings have direct applications for autonomous systems, robotics, and interactive AI development beyond gaming