One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration
Researchers introduce OneLife, a framework for learning symbolic world models from minimal, unguided exploration in complex stochastic environments. The approach uses conditionally activated programmatic laws within a probabilistic framework, outperforms baselines on 16 of 23 test scenarios, and advances the autonomous construction of world models for unknown environments.
OneLife represents a meaningful advance in artificial intelligence's ability to autonomously understand and model complex environments with minimal data and guidance. Traditional symbolic world modeling has relied on deterministic settings, abundant interaction data, and human supervision—limitations that severely restrict real-world applicability. This research addresses those constraints by developing a framework where an agent learns environmental dynamics from a single exploratory session in a stochastic, hostile setting, mirroring conditions agents might face in genuine discovery scenarios.
The technical innovation centers on conditionally activated programmatic laws that form a dynamic computation graph. Rather than weighing every rule equally when predicting state transitions, the framework routes computation only through the laws whose preconditions hold in the current state. This design addresses scaling challenges inherent in complex, hierarchical state spaces and enables learning of stochastic dynamics despite sparse rule-activation patterns. The introduction of Crafter-OO, a structured, object-oriented environment, provides rigorous evaluation infrastructure with state-ranking and fidelity metrics.
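The idea of conditionally activated laws can be illustrated with a minimal sketch. This is not the paper's implementation; all names (`Law`, `step`, the example laws) are hypothetical. Each law pairs a precondition with an effect, and only laws whose preconditions match the current state participate in the transition, yielding a sparse, state-dependent computation graph; a per-law firing probability models stochastic dynamics.

```python
import random

class Law:
    """A hypothetical conditionally activated programmatic law."""
    def __init__(self, name, precondition, effect, prob=1.0):
        self.name = name
        self.precondition = precondition  # state -> bool: is this law relevant?
        self.effect = effect              # state -> updated copy of state
        self.prob = prob                  # chance the law fires when active

def step(state, laws, rng=random):
    """Predict the next state by routing only through active laws."""
    next_state = dict(state)
    for law in laws:
        # Inactive laws are skipped entirely -- the computation graph
        # depends on the state, not on the full rule set.
        if law.precondition(next_state) and rng.random() < law.prob:
            next_state = law.effect(next_state)
    return next_state

# Two toy laws in the spirit of a Crafter-like survival environment.
laws = [
    Law("collect_wood",
        precondition=lambda s: s["action"] == "chop" and s["near_tree"],
        effect=lambda s: {**s, "wood": s["wood"] + 1}),
    Law("night_damage",
        precondition=lambda s: s["time"] == "night",
        effect=lambda s: {**s, "health": s["health"] - 1},
        prob=0.3),  # stochastic law: fires only sometimes when active
]

state = {"action": "chop", "near_tree": True, "wood": 0,
         "health": 9, "time": "day"}
result = step(state, laws)
print(result["wood"])   # the chop law is active and deterministic, so wood -> 1
print(result["health"])  # the night law is inactive by day, so health stays 9
```

The routing step is the key design choice: prediction cost scales with the number of *active* laws, not the total number learned.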
For the broader AI community, OneLife demonstrates that symbolic reasoning and program synthesis can extend beyond controlled laboratory settings. Success on planning tasks—where simulated rollouts identified superior strategies—suggests practical applications in robotics, game AI, and autonomous systems where exploration data is costly and human guidance unavailable. The framework's emphasis on minimal interaction aligns with efficiency demands in real-world deployments.
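The planning use case described above can be sketched as follows. This is an illustrative Monte Carlo rollout planner, not the paper's method: candidate action sequences are simulated through the learned (possibly stochastic) model, and the plan with the highest average return wins. All function names and the toy model are assumptions.

```python
import random

def simulate(model, state, actions, rng):
    """Roll a candidate plan forward through the learned model."""
    total = 0
    for a in actions:
        state, reward = model(state, a, rng)
        total += reward
    return total

def plan(model, state, candidates, n_rollouts=30, seed=0):
    """Pick the candidate plan with the best mean simulated return."""
    rng = random.Random(seed)
    def score(plan_):
        return sum(simulate(model, dict(state), plan_, rng)
                   for _ in range(n_rollouts)) / n_rollouts
    return max(candidates, key=score)

# Toy learned model: "gather" reliably yields +1 per step, while
# "gamble" yields +2 with probability 0.2 (mean 0.4 per step).
def toy_model(state, action, rng):
    if action == "gather":
        return state, 1
    return state, (2 if rng.random() < 0.2 else 0)

best = plan(toy_model, {}, [["gather"] * 3, ["gamble"] * 3])
print(best)  # the steady "gather" plan has the higher expected return
```

Because rollouts happen inside the model rather than the real environment, no additional (costly) interaction is required to compare strategies.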
Future research directions include scaling to higher-dimensional state spaces, handling more complex stochastic phenomena, and integrating learned models into planning algorithms. The work establishes methodological foundations for machines to autonomously discover environmental rules, a prerequisite for truly autonomous agents operating in unknown domains.
- OneLife learns world dynamics from single unguided exploration sessions in complex stochastic environments, addressing limitations of prior deterministic-focused approaches
- Conditionally activated programmatic laws create dynamic computation graphs that scale efficiently by routing inference only through relevant rules
- Framework outperforms baselines on 16 of 23 test scenarios and successfully uses learned models for strategic planning
- New evaluation protocol measures state ranking and fidelity, providing rigorous metrics for symbolic world model quality
- Research establishes foundation for autonomous construction of programmatic models in unknown, complex environments
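The state-ranking evaluation mentioned above can be sketched in a few lines. This is an assumed formulation, not the paper's protocol: the learned model scores the true next state against distractor states, and a faithful model should rank the true successor first. `rank_of_true_state` and `toy_score` are illustrative names.

```python
def rank_of_true_state(score_fn, prev_state, action, candidates, true_idx):
    """Return the 1-based rank the model assigns to the true next state."""
    scores = [score_fn(prev_state, action, c) for c in candidates]
    # Sort candidate indices from highest score to lowest.
    order = sorted(range(len(candidates)), key=lambda i: -scores[i])
    return order.index(true_idx) + 1  # 1 means the model ranked it best

# Toy score: favor candidates consistent with a simple "chop adds wood" rule.
def toy_score(prev, action, cand):
    expected_wood = prev["wood"] + (1 if action == "chop" else 0)
    return -abs(cand["wood"] - expected_wood)

prev = {"wood": 2}
candidates = [{"wood": 2}, {"wood": 3}, {"wood": 5}]
rank = rank_of_true_state(toy_score, prev, "chop", candidates, true_idx=1)
print(rank)  # the true successor ({"wood": 3}) is ranked first
```

A mean rank near 1 across held-out transitions indicates a high-fidelity model; averaging the reciprocal rank gives a standard summary statistic.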