Tell Me a Story! Narrative-Driven XAI with Large Language Models
Researchers introduce XAIstories, a framework that uses Large Language Models to convert complex AI explanations (SHAP values and counterfactual explanations) into human-readable narratives. User studies show over 90% of general audiences find these AI-generated stories convincing, with data scientists viewing them as valuable for explaining AI decisions to non-technical stakeholders.
XAIstories addresses a critical gap in explainable AI adoption: the comprehension barrier. While SHAP values and counterfactual explanations are technically rigorous, they remain inaccessible to most non-expert users, limiting the practical utility of XAI in real-world applications. By bridging this gap with narrative explanations, the research tackles a persistent challenge in AI democratization.
The broader context reflects growing regulatory and commercial pressure for AI transparency. As machine learning models proliferate in high-stakes domains like credit scoring and healthcare, stakeholders demand interpretability. Traditional XAI methods address this demand in theory but often fall short in practice: an explanation that users cannot understand provides little actual value. This research demonstrates that LLMs can serve as effective translation layers between technical explanations and human comprehension.
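To make the "translation layer" idea concrete, here is a minimal sketch of how SHAP attributions might be turned into a prompt for an LLM. It is not the authors' prompt or pipeline: the function name `build_shap_story_prompt`, the prompt wording, and the credit-scoring feature names and values are all hypothetical, and the LLM call itself is left out.

```python
from typing import Mapping


def build_shap_story_prompt(
    feature_values: Mapping[str, float],
    shap_values: Mapping[str, float],
    prediction: str,
    top_k: int = 3,
) -> str:
    """Format the top-k SHAP attributions as a plain-text prompt (hypothetical helper)."""
    # Rank features by the magnitude of their SHAP value, largest first.
    ranked = sorted(shap_values, key=lambda name: abs(shap_values[name]), reverse=True)

    lines = []
    for name in ranked[:top_k]:
        # A positive SHAP value pushes the model output up, a negative one down.
        direction = "raised" if shap_values[name] > 0 else "lowered"
        lines.append(
            f"- {name} = {feature_values[name]} {direction} the model's score "
            f"(SHAP {shap_values[name]:+.2f})"
        )

    # Assumed prompt wording; the source does not give the actual prompt.
    return (
        f"The model predicted: {prediction}.\n"
        "The most influential features were:\n"
        + "\n".join(lines)
        + "\nExplain this decision to a non-expert as a short, factual story. "
        "Mention only the features listed above."
    )


if __name__ == "__main__":
    # Illustrative, made-up values for a single loan applicant.
    prompt = build_shap_story_prompt(
        feature_values={"income": 32000, "debt_ratio": 0.62, "age": 41},
        shap_values={"income": -0.12, "debt_ratio": -0.31, "age": 0.02},
        prediction="loan denied",
        top_k=2,
    )
    print(prompt)
```

The resulting string would then be sent to whichever LLM is in use. Restricting the prompt to the top-ranked features and asking the model to mention only those is one simple design choice for keeping the story tied to the underlying explanation.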
The impact extends across multiple constituencies. For enterprises, XAIstories could reduce compliance friction in regulated industries by making AI decisions more defensible to auditors and regulators. For data scientists, it offers a scalable tool for stakeholder communication without manual narrative crafting. For end-users affected by AI decisions, clearer explanations could enable better-informed challenges to unfavorable predictions, particularly in lending or hiring contexts.
The tenfold speedup that CFstories delivers over manually written narratives suggests the approach could be viable in production. However, questions remain about narrative hallucination, consistency across similar cases, and whether LLM-generated stories might introduce new forms of bias or obscure important nuances in explanations. Future work should examine long-term user trust and whether improved understanding translates to better real-world decision-making outcomes.
- Over 90% of general audiences find LLM-generated narratives of AI explanations convincing, addressing the comprehension gap in explainable AI.
- 83% of surveyed data scientists say they are likely to adopt SHAPstories for communicating AI decisions to non-technical stakeholders.
- Users correctly answered comprehension questions significantly more often with narrative explanations than with raw SHAP values alone.
- CFstories achieve a tenfold speed improvement over manual narrative creation while maintaining comparable convincingness to human-written explanations.
- XAIstories has potential applications in regulated domains like credit scoring where AI decisions require human-understandable justification.