y0news

#evaluation-standards News & Analysis

1 article tagged with #evaluation-standards. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 10h ago · 7/10

Mental Health AI Safety Claims Must Preserve Temporal Evidence

Researchers argue that current mental health AI safety evaluations miss clinically significant failures because they assess isolated responses rather than temporal patterns across conversations. The paper introduces Temporal Safety Non-Identifiability to formalize why sequence-dependent failures cannot be certified by turn-level evaluations, and it proposes SCOPE-MH, a new evaluation standard that preserves conversation history and cumulative effects.