y0news
#ai-judge · 1 article
AI · Neutral · arXiv – CS AI · 4h ago · 7/10

Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

Researchers developed a scalable method that uses LLMs as judges to evaluate the safety of model responses to users showing signs of psychosis, and found that the judges' ratings align strongly with human clinical consensus. The automated assessment targets a critical risk: LLMs may reinforce delusions in vulnerable mental health populations.