AI · Bearish · arXiv – CS AI · 7h ago · 7/10
Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems
Researchers introduce SkillVetBench, a security benchmark for detecting malicious skills in open agent platforms, addressing supply-chain risks in extensible AI ecosystems. The framework combines semantic analysis of skill specifications with runtime execution monitoring in sandboxes, and its results show that static-only defenses miss up to 89% of threats hidden in natural-language instructions and multi-component logic.