
STARS: Skill-Triggered Audit for Request-Conditioned Invocation Safety in Agent Systems

arXiv – CS AI | Guijia Zhang, Shu Yang, Xilin Gong, Di Wang
AI Summary

Researchers introduce STARS, a framework for continuously auditing AI agent skill invocations in real time by combining static capability analysis with request-conditioned risk modeling. The approach improves detection of prompt injection attacks over static baselines, though it remains most valuable as a triage layer rather than a complete replacement for pre-deployment screening.

Analysis

The rapid proliferation of autonomous AI agents equipped with external tools and skills has created a critical security challenge: determining whether a particular tool invocation is safe within its specific operational context. Traditional static auditing examines capability surfaces before deployment but cannot account for dynamic risk factors emerging at runtime. STARS addresses this gap by formulating skill invocation safety as a continuous risk-estimation problem, enabling ranking and prioritization of potentially problematic actions before execution occurs.

The framework combines three components: a static capability prior establishing baseline risk profiles, a request-conditioned invocation risk model that evaluates specific user requests against tool behavior, and a calibrated risk-fusion policy that synthesizes these signals. The researchers constructed SIA-Bench, a benchmark dataset of 3,000 labeled invocation records including indirect prompt injection scenarios, to evaluate their approach against existing methods.
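The fusion of a static prior with a request-conditioned score can be sketched as a simple calibrated logistic combination. This is a minimal illustration, not the paper's actual fusion policy: the function name, weights, and bias below are hypothetical placeholders for whatever calibration STARS learns.

```python
import math

def fuse_risk(static_prior: float, contextual_score: float,
              w_static: float = 1.0, w_context: float = 2.0,
              bias: float = -1.5) -> float:
    """Combine a static capability prior with a request-conditioned
    invocation score into a single risk estimate in [0, 1].

    Weights and bias here are illustrative; in practice they would
    be calibrated on labeled invocation data.
    """
    z = w_static * static_prior + w_context * contextual_score + bias
    return 1.0 / (1.0 + math.exp(-z))

# The same high-capability tool scores higher risk when the
# request context looks suspicious than when it looks routine.
risky = fuse_risk(static_prior=0.8, contextual_score=0.9)
routine = fuse_risk(static_prior=0.8, contextual_score=0.1)
assert risky > routine
```

Weighting the contextual signal more heavily than the static prior reflects the paper's finding that runtime context carries additional information the capability surface alone cannot provide.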

Results show meaningful but modest improvements in detecting high-risk invocations, with calibrated fusion achieving 0.439 AUPRC on indirect prompt injection attacks versus 0.405 for contextual-only and 0.380 for static-only baselines. However, performance gains narrow on standard in-distribution test sets, indicating that static priors retain substantial value for routine scenarios. This suggests request-conditioned auditing functions optimally as an intermediate risk-scoring mechanism within a layered defense strategy rather than as a complete safety replacement.
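AUPRC, the metric reported above, can be computed as average precision: the mean of precision values at each rank where a true high-risk invocation is retrieved. The sketch below uses toy labels and scores, not SIA-Bench data.

```python
def average_precision(labels, scores):
    """AUPRC via average precision: rank invocations by risk score
    (descending) and average precision at each positive hit."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:  # 1 = high-risk invocation, 0 = benign
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

# Toy example: three high-risk invocations among six records.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(round(average_precision(labels, scores), 3))  # → 0.722
```

Because positives (high-risk invocations) are rare in realistic audit logs, AUPRC is a more informative metric here than accuracy or AUROC, which is presumably why the authors report it.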

For the AI agent ecosystem, this work establishes that runtime context matters significantly for safety decisions, but also validates that no single approach eliminates the need for comprehensive multi-stage screening. Organizations deploying autonomous agents should implement complementary static and dynamic auditing rather than relying exclusively on either method.

Key Takeaways
  • STARS combines static analysis with request-conditioned risk modeling to audit AI agent skill invocations at runtime
  • Contextual auditing improves prompt injection detection but gains diminish on standard test cases, validating multi-stage screening approaches
  • SIA-Bench benchmark provides 3,000 labeled invocation records for evaluating agent safety systems
  • Dynamic risk-scoring serves best as a triage and prioritization layer alongside static pre-deployment screening
  • Calibration of risk fusion policies proves critical for practical deployment in safety-sensitive applications
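The triage-layer role described above amounts to routing each invocation by its fused risk score. A minimal sketch, assuming hypothetical thresholds (the paper does not specify a routing policy):

```python
def triage(risk: float, block_at: float = 0.9,
           review_at: float = 0.5) -> str:
    """Route a skill invocation by fused risk score.

    Thresholds are illustrative and would need calibration
    against a labeled set such as SIA-Bench before deployment.
    """
    if risk >= block_at:
        return "block"         # deny execution outright
    if risk >= review_at:
        return "human_review"  # queue for a human auditor
    return "allow"             # proceed with execution

assert triage(0.95) == "block"
assert triage(0.60) == "human_review"
assert triage(0.10) == "allow"
```

Placing the dynamic scorer in front of a human-review queue, rather than making it the sole gate, matches the paper's conclusion that request-conditioned auditing works best as one layer of a multi-stage defense.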