y0news

#decision-evidence News & Analysis

1 article tagged with #decision-evidence. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

1 article
AI · Neutral · arXiv – CS AI · 8h ago · 6/10

DEMM-Bench: A Cross-Regime Benchmark for Agent-Runtime Governance-Evidence Sufficiency

DEMM-Bench introduces a benchmark framework for evaluating whether evidence records in agent-runtime systems sufficiently answer governance questions about specific decisions. Using the Decision Evidence Maturity Model, researchers tested 64 cases across eight evidence regimes and found that existing baselines overclaim sufficiency in 50-75% of cases, while a property-level scorer achieved 56.25% accuracy with zero overclaims.
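
The two headline metrics can be illustrated with a minimal sketch. It assumes a binary sufficient/insufficient label per case and treats an "overclaim" as a prediction of sufficiency where the reference label says the evidence is insufficient; the benchmark itself may define both differently. The names `Case` and `score` are hypothetical, and the data is synthetic, chosen only so that 36 correct answers out of 64 reproduces the 56.25% figure on the assumption that accuracy is taken over all 64 cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One benchmark case (hypothetical structure, not the paper's schema)."""
    expected_sufficient: bool   # reference label
    predicted_sufficient: bool  # scorer output


def score(cases: list[Case]) -> dict[str, float]:
    """Return accuracy and overclaim rate over a list of cases."""
    if not cases:
        raise ValueError("at least one case is required")
    total = len(cases)
    # A case is correct when the prediction matches the reference label.
    correct = sum(c.expected_sufficient == c.predicted_sufficient for c in cases)
    # Assumed definition: an overclaim says "sufficient" when the reference says it is not.
    overclaims = sum(
        c.predicted_sufficient and not c.expected_sufficient for c in cases
    )
    return {"accuracy": correct / total, "overclaim_rate": overclaims / total}


# Synthetic example: a conservative scorer that never overclaims.
# 36 correct "insufficient" verdicts plus 28 missed "sufficient" cases (underclaims).
conservative = (
    [Case(expected_sufficient=False, predicted_sufficient=False)] * 36
    + [Case(expected_sufficient=True, predicted_sufficient=False)] * 28
)
result = score(conservative)
print(result)
```

Under these assumptions, a scorer can have modest accuracy and still never overclaim, because all of its errors fall on the cautious side.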