AINeutralarXiv โ CS AI ยท 7h ago6/10
๐ง
SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills
Researchers introduced SkillSieve, a three-layer detection framework that identifies malicious AI agent skills in OpenClaw's ClawHub marketplace, where 13-26% of over 13,000 skills contain security vulnerabilities. The system combines regex/AST scanning, LLM-based analysis with parallel sub-tasks, and multi-LLM voting to achieve 0.800 F1 score at $0.006 per skill, significantly outperforming existing detection methods.