y0news

#policy-reasoning News & Analysis

1 article tagged with #policy-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 9h ago · 6/10

PSEBench: A Controllable and Verifiable Benchmark for Evaluating LLMs in Patient Safety Event Triage

Researchers introduced PSEBench, a 5,074-case benchmark for evaluating large language models on patient safety event triage, the critical task of determining whether clinical incidents require reporting under regulatory policy. The methodology uses policy-grounded clause cards and verification mechanisms to ensure reliable evaluation of LLM reasoning, information-seeking behavior, and appropriate abstention in ambiguous cases.