AIBullisharXiv – CS AI · 18h ago7/10
🧠
AgentTrust: A Self-Improving Trust Layer for AI-Agent Actions
AgentTrust v2 introduces a self-improving trust layer for AI agents that distinguishes between lexical (rule-detectable) and semantic (intent-dependent) threats. Using an LLM judge combined with a dual-store system, it achieves 83.6-85.2% accuracy on semantic threats while progressively distilling deterministic rules for lexical threats, demonstrating zero false-blocks across 45,000 test actions.