Inform, Coach, Relate, Listen: Auditing LLM Caregiving Support Roles
Researchers audited how large language models change their safety profiles when deployed in different caregiving support roles, testing GPT-4o-mini, Llama-3.1-8B, and MedGemma across 5,000 real dementia-care queries. The study found that directive, information-focused roles increase interactional risks despite being perceived as more helpful, revealing a quality-safety tradeoff that challenges current LLM safety evaluation practices.
This research addresses a critical gap in AI safety evaluation by examining how context shapes model behavior beyond generic testing conditions. Traditional safety audits rely on standardized prompts, but real-world deployment often involves nuanced conversational roles where users seek emotional support alongside information. The study operationalized four support roles—Inform, Coach, Relate, and Listen—grounded in established social support theory, providing a systematic framework for understanding contextual risk variation.
The findings carry significant implications for AI deployment in healthcare and social support contexts. The roles rated most helpful and trustworthy (the directive, information-heavy ones) also exhibited elevated interactional risks, suggesting that users may not perceive safety hazards in authoritative-sounding guidance. This quality-safety tension is particularly concerning in caregiving, where vulnerable populations depend on AI systems for decision-making support.
For developers and organizations deploying LLMs in health-adjacent applications, this research demonstrates that role-specific safety evaluation is essential before production release. The release of 90,000 annotated responses creates a valuable resource for developing more context-aware safety guardrails. This work indicates that one-size-fits-all safety standards are insufficient for conversational AI systems operating in varied social contexts.
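As a rough illustration of what role-specific evaluation could look like in practice, the sketch below groups annotated responses by support role and reports a risk rate and mean helpfulness score per role. The record fields (`role`, `risk_flag`, `helpfulness`) and the function name `summarize_by_role` are hypothetical and are not taken from the released dataset; the toy records are placeholders, not results from the study.

```python
from collections import defaultdict


def summarize_by_role(records):
    """Aggregate per-role risk rate and mean helpfulness.

    Assumes each record is a dict with hypothetical keys:
    'role' (str), 'risk_flag' (bool), 'helpfulness' (float).
    """
    totals = defaultdict(lambda: {"n": 0, "risky": 0, "helpful_sum": 0.0})
    for rec in records:
        bucket = totals[rec["role"]]
        bucket["n"] += 1
        bucket["risky"] += 1 if rec["risk_flag"] else 0
        bucket["helpful_sum"] += rec["helpfulness"]
    return {
        role: {
            "n": b["n"],
            "risk_rate": b["risky"] / b["n"],
            "mean_helpfulness": b["helpful_sum"] / b["n"],
        }
        for role, b in totals.items()
    }


# Placeholder records for illustration only; not data from the study.
toy_records = [
    {"role": "Inform", "risk_flag": True, "helpfulness": 4.0},
    {"role": "Inform", "risk_flag": False, "helpfulness": 5.0},
    {"role": "Listen", "risk_flag": False, "helpfulness": 3.0},
    {"role": "Listen", "risk_flag": False, "helpfulness": 4.0},
]

summary = summarize_by_role(toy_records)
for role, stats in summary.items():
    print(role, stats)
```

Reporting risk and helpfulness side by side for each role, rather than as a single pooled score, is what lets a quality-safety tension of the kind described here show up in an audit at all.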
Looking forward, the research highlights the need for role-conditioned safety frameworks and human-in-the-loop evaluation protocols for sensitive domains. Organizations should expect increased scrutiny around safety practices in caregiving AI, and regulators may demand contextual risk assessments alongside generic benchmarks.
- LLM safety profiles vary significantly based on assigned support roles, not just model architecture or prompts.
- More directive, information-focused roles increase interactional risks while appearing more helpful to users.
- Traditional safety evaluations miss context-specific vulnerabilities present in real-world conversational support scenarios.
- 90,000 annotated support-role responses provide a foundation for developing safer caregiving AI systems.
- Healthcare and social support AI deployments require role-conditioned safety auditing, not generic benchmarking.