y0news

#agent-skills News & Analysis

11 articles tagged with #agent-skills. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

ANDES: Agent Native Data Evolving Synthesis Tool for Autonomous Instruction Alignment

Researchers introduce ANDES, a framework that enables AI agents to autonomously generate high-quality training data for LLM alignment by abstracting complex data-gathering tasks into a manageable agent skill. The system uses a self-evolving World Tree routing mechanism to help agents navigate noisy web environments and achieve state-of-the-art performance on alignment benchmarks despite computational constraints.

AI · Neutral · arXiv – CS AI · 2d ago · 7/10

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

Researchers released ClawHub Security Signals, a dataset of 67,453 AI agent skills analyzed by three security scanners, revealing significant disagreement among detection methods. Only 0.69% of skills were flagged by all three scanners, indicating that single-scanner verdicts are insufficient for securing AI agent ecosystems and that layered security governance is needed instead.

🏢 Nvidia
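To make the scanner-disagreement figure concrete, here is a minimal sketch of how a share like "flagged by all three scanners" could be computed. The skill names, the verdict layout, and the function name are hypothetical and are not taken from the ClawHub dataset or its tooling.

```python
from collections import Counter


def agreement_counts(verdicts):
    """Count how many skills were flagged by 0, 1, 2, or 3 scanners."""
    return Counter(sum(flags) for flags in verdicts.values())


# Hypothetical verdicts: skill name -> one boolean flag per scanner.
verdicts = {
    "skill-a": (True, True, True),
    "skill-b": (True, False, False),
    "skill-c": (False, False, False),
    "skill-d": (False, True, True),
}

counts = agreement_counts(verdicts)
# Share of skills on which every scanner agrees the skill is suspicious.
unanimous_share = counts[3] / len(verdicts)
print(f"flagged by all three: {unanimous_share:.2%}")
```

A low unanimous share next to a much larger "flagged by at least one" share is the pattern the summary describes: each scanner catches things the others miss.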
AI · Bearish · arXiv – CS AI · May 28 · 7/10

Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem

Researchers identified 76 confirmed malicious AI agent skills across major marketplaces, with 13.4% of 3,984 analyzed skills containing critical security vulnerabilities. The findings highlight urgent risks as AI agents gain access to sensitive credentials and systems, with malicious payloads still publicly available on platforms like clawhub.ai.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

Researchers propose a bilevel optimization framework using Monte Carlo Tree Search to systematically improve LLM agent skills—structured collections of instructions, tools, and resources. The framework optimizes both skill structure and component content simultaneously, demonstrating performance improvements on Operations Research tasks and addressing a previously unsolved challenge in agent design optimization.

AI · Bearish · arXiv – CS AI · Apr 6 · 7/10

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

Researchers conducted the first comprehensive security analysis of Agent Skills, an emerging standard for LLM-based agents to acquire domain expertise. The study identified significant structural vulnerabilities across the framework's lifecycle, including lack of data-instruction boundaries and insufficient security review processes.

AI × Crypto · Bullish · Blockonomi · Apr 4 · 7/10

Solana Foundation Launches Agent Skills to Connect AI Tools With On-Chain Operations

Solana Foundation launched Agent Skills, a developer toolkit that enables one-line integration of AI tools with blockchain operations. The platform features over 60 community-built skills with prebuilt security and compatibility components, supporting DeFi, payments, and infrastructure functions across major platforms like JupiterExchange and Raydium.

$SOL
AI · Neutral · arXiv – CS AI · May 4 · 6/10

Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

Semia is a static auditor for LLM-driven agent skills that uses constraint-guided synthesis to analyze security risks in hybrid code-and-prose configurations. Tested on 13,728 real-world skills from public marketplaces, Semia identified critical semantic vulnerabilities in over half of them and achieved 97.7% recall, significantly outperforming existing security tools.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Researchers propose a trust framework for AI agent skills—reusable code packages that extend language models—treating them as untrusted by default until verified. The approach introduces verification levels, capability gates, and correctness criteria to enable sustainable human-in-the-loop oversight without operational bottlenecks.

AI · Neutral · arXiv – CS AI · Mar 16 · 6/10

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

SkillsBench is a new benchmark for evaluating Agent Skills, structured packages of procedural knowledge that enhance LLM agents. Testing across 86 tasks and 11 domains shows that curated Skills improve performance by 16.2 percentage points on average, while self-generated Skills provide no benefit.

AI · Bullish · Hugging Face Blog · Mar 6 · 6/10

Conversational LLM Evaluations in Minutes with NVIDIA NeMo Evaluator Agent Skills

NVIDIA has released NeMo Evaluator Agent Skills, a tool for evaluating conversational large language models in minutes. It streamlines testing and validation for LLM applications, potentially accelerating AI development workflows.

🏢 Nvidia
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 6

Formal Analysis and Supply Chain Security for Agentic AI Skills

Researchers developed SkillFortify, the first formal analysis framework for securing AI agent skill supply chains, addressing critical vulnerabilities exposed by attacks like ClawHavoc, which involved over 1,200 malicious skills. The framework achieved a 96.95% F1 score with 100% precision (zero false positives) in detecting malicious AI agent skills.
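The two reported figures together imply a recall value, which the summary does not state. As a rough worked calculation, assuming the standard definition of F1 as the harmonic mean of precision $P$ and recall $R$, and taking precision as exactly 1:

```latex
F_1 = \frac{2PR}{P + R}
\quad\Longrightarrow\quad
R = \frac{F_1}{2 - F_1} \quad (P = 1)

R = \frac{0.9695}{2 - 0.9695} = \frac{0.9695}{1.0305} \approx 0.941
```

Under those assumptions, roughly 94% of malicious skills in the evaluation set would have been detected. This is an inference from the summary's numbers, not a figure reported by the authors.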