Governed Metaprogramming for Intelligent Systems: Reclassifying Eval as a Governed Effect
Researchers propose governed metaprogramming, a language design framework that reclassifies the eval function from an unrestricted primitive into a controlled effect subject to governance and inspection. The approach aims to address security and authority risks in AI systems that synthesize executable code at runtime, with implementation demonstrated in MashinTalk, a DSL for AI workflows.
The paper addresses a fundamental challenge in modern AI systems: as language models and autonomous agents increasingly generate and execute code at runtime, the traditional treatment of eval as a simple language primitive becomes a security liability. The authors frame code materialization as an authority amplification problem—converting symbolic representations into executable permissions requires the same governance applied to other privileged operations.
Historically, homoiconic languages like Lisp treated eval as unfettered access to the execution environment. This design assumes human programmers control what code runs. AI systems invert this assumption; humans specify high-level goals while systems generate implementation details. Without governance layers, a model could synthesize malicious operations or exceed intended capability boundaries. The governed metaprogramming framework separates pure form manipulation (analysis of code structure) from materialization (actual execution), enabling policy validation and resource estimation before code runs.
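The separation the framework describes can be illustrated with a minimal sketch. The code below is not MashinTalk; it is a hypothetical Python analogue in which an `analyze` step inspects a generated program's structure without running it, and a `materialize` step executes only after a (here, deliberately simplistic) name-based policy check passes. The `FORBIDDEN` set and function names are illustrative assumptions, not part of the paper's API.

```python
import ast

# Hypothetical policy: names that generated code may not reference.
FORBIDDEN = {"open", "exec", "eval", "__import__"}

def analyze(source: str) -> set:
    """Pure form manipulation: inspect code structure without executing it."""
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def materialize(source: str) -> dict:
    """Governed materialization: validate against policy, then execute."""
    violations = analyze(source) & FORBIDDEN
    if violations:
        raise PermissionError(f"policy violation: {sorted(violations)}")
    namespace = {}
    # Execute with an empty builtins table to narrow ambient authority.
    exec(compile(source, "<generated>", "exec"), {"__builtins__": {}}, namespace)
    return namespace

# Benign generated code passes the gate.
ns = materialize("result = 2 + 3")
assert ns["result"] == 5

# Code reaching for privileged operations is rejected before it runs.
try:
    materialize("data = open('/etc/passwd').read()")
except PermissionError as err:
    print(err)
```

The key property, mirroring the paper's framing, is that the policy decision happens on the symbolic form, before any authority is conferred; a production system would layer in capability analysis and resource estimation at the same checkpoint.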
The implementation in MashinTalk demonstrates practical applicability within the Erlang/BEAM ecosystem and is validated against 454 machine-checked theorems, suggesting the approach is rigorous rather than merely theoretical. For developers building AI agents and code-generation systems, the framework provides mechanisms to enforce sandboxing and compliance without eliminating dynamic behavior entirely.
The work signals growing recognition that safety-critical AI systems require architectural constraints at the language level. Organizations deploying code-generating AI need execution environments where synthesis and materialization are separated. Future development hinges on whether developers adopt governed materialization patterns and whether performance overhead remains acceptable at scale.
- Eval transitions from an unrestricted primitive to a governed effect subject to policy inspection and capability analysis before execution
- Separating pure form evaluation from materialization enables security boundaries in AI systems that generate executable code
- The MashinTalk implementation demonstrates practical feasibility, with validation against 454 machine-checked theorems
- Addresses emerging security risks as LLMs and agents increasingly synthesize and execute code dynamically at runtime
- The framework enables compliance verification and resource estimation before untrusted code is materialized