ContextualSHAP: Enhancing SHAP Explanations Through Contextual Language Generation

Summary

This paper introduces ContextualSHAP, a Python package that extends the SHAP XAI method by integrating it with Large Language Models (LLMs), specifically OpenAI's GPT, to generate contextualized textual explanations. Addressing SHAP's limitation in providing meaningful context for non-technical users, the tool uses user-defined parameters to tailor explanations. Preliminary user evaluations in a healthcare case study suggest that these combined visual-textual explanations are perceived as more understandable and contextually appropriate than visual-only outputs.

Medical Relevance

This research is highly relevant to medicine as it aims to make complex AI model decisions more interpretable and understandable for healthcare professionals. By providing contextual explanations, it can enhance trust and facilitate the responsible deployment of AI in high-stakes medical domains like diagnostics and treatment planning.

AI Health Application

This research enhances the explainability (XAI) of AI models used in healthcare by providing contextualized, natural language explanations alongside traditional visualizations. This is critical for medical AI applications where clinicians, patients, or administrators need to understand why an AI model made a particular prediction or recommendation (e.g., diagnosis, treatment plan, risk assessment). By making AI explanations more accessible and trustworthy, it facilitates safer and more effective integration of AI into clinical practice and healthcare decision-making, improving user acceptance and supporting more informed actions based on AI outputs.

Key Points

  • SHAP, while prominent for feature importance visualization, often lacks contextual explanations that are meaningful for non-technical end-users.
  • The proposed solution, ContextualSHAP, is a Python package that integrates SHAP with an LLM (OpenAI's GPT) to generate rich, contextualized textual explanations.
  • Explanations are customized using user-defined parameters, including feature aliases, descriptions, and additional background, allowing tailoring to model context and user perspective.
  • The package was evaluated in a healthcare-related case study involving real end-users.
  • User evaluations, utilizing Likert-scale surveys and follow-up interviews, indicated improved perceived understandability and contextual appropriateness of the generated explanations.
  • The findings suggest that combining SHAP's visual outputs with LLM-generated contextual text can lead to more user-friendly and trustworthy model explanations.
  • The results are considered preliminary, highlighting the need for further validation.
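
As background for the points above, the following is a minimal sketch of the Shapley attribution that SHAP builds on, computed exactly for a tiny model. It is not the SHAP library's algorithm, which relies on approximations and a background dataset, and it is not code from ContextualSHAP. The function `shapley_values`, the model `toy_risk`, and the patient and reference values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Assumption: an "absent" feature is replaced by its value in a single
    baseline instance. Cost grows exponentially with the number of features.
    """
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` keep the instance's value; the rest use the baseline.
        mixed = {f: (instance[f] if f in subset else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Shapley weight for a coalition of size k that excludes f.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(x):
    # Hypothetical toy score with an interaction term; not a clinical model.
    return 0.02 * x["age"] + 0.5 * x["smoker"] + 0.01 * x["age"] * x["smoker"]


patient = {"age": 60, "smoker": 1}
reference = {"age": 40, "smoker": 0}
phi = shapley_values(toy_risk, patient, reference)
print(phi)  # approximately {'age': 0.5, 'smoker': 1.0}
```

The two attributions add up to the difference between the patient's score and the reference score. This additive property is what SHAP plots visualize, and numbers of this kind are what a textual explanation would need to put into context.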

Methodology

The methodology involved developing a Python package (ContextualSHAP) that integrates the SHAP explanation framework with OpenAI's GPT LLM. This integration enables the generation of natural language explanations, guided by user-defined parameters (e.g., feature aliases, descriptions, background information). The package's effectiveness was evaluated through a healthcare-related case study, employing user evaluations via Likert-scale surveys and follow-up interviews with real end-users to assess perceived understandability and contextual appropriateness.
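
The integration described above can be sketched roughly as follows. This is an illustrative assumption about how per-feature contributions and user-defined parameters might be combined into a prompt; it is not ContextualSHAP's actual API. The function `build_prompt`, its parameters, the feature names, and the numbers are all hypothetical, and the call to the LLM itself is deliberately left out.

```python
def build_prompt(contributions, prediction, aliases=None, descriptions=None,
                 background="", top_k=3):
    """Turn per-feature contributions into a plain-text prompt for an LLM.

    Hypothetical helper: `aliases` maps raw feature names to readable labels,
    `descriptions` adds a short gloss, and `background` describes the model.
    """
    aliases = aliases or {}
    descriptions = descriptions or {}

    # Keep only the features with the largest absolute contribution.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[:top_k]:
        label = aliases.get(name, name)
        if value > 0:
            effect = f"raises the prediction by {abs(value):.2f}"
        elif value < 0:
            effect = f"lowers the prediction by {abs(value):.2f}"
        else:
            effect = "does not change the prediction"
        line = f"- {label}: {effect}"
        if name in descriptions:
            line += f" ({descriptions[name]})"
        lines.append(line)

    parts = []
    if background:
        parts.append(f"Background: {background}")
    parts.append(f"Model prediction: {prediction:.2f}")
    parts.append("Most influential features:")
    parts.extend(lines)
    parts.append("Explain this prediction in plain language for a non-technical reader.")
    return "\n".join(parts)


# Hypothetical contributions and metadata, for illustration only.
contributions = {"bmi": 0.12, "glucose": 0.31, "age": -0.05, "bp": 0.01}
prompt = build_prompt(
    contributions,
    prediction=0.72,
    aliases={"bmi": "Body mass index", "glucose": "Fasting glucose"},
    descriptions={"glucose": "blood sugar after an overnight fast"},
    background="The model estimates diabetes risk for adult patients.",
)
print(prompt)
```

The resulting string would then be sent to the language model. Keeping prompt assembly separate from the model call makes the text that the LLM sees easy to inspect and to adjust per audience.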

Key Findings

The primary finding is that end-users perceived the LLM-generated contextual explanations, presented together with SHAP visualizations, as more understandable and contextually appropriate than visual-only SHAP outputs. The abstract describes these results as preliminary and does not report effect sizes or statistical significance. The results suggest that the proposed integration can improve the interpretability of AI model predictions for non-technical audiences.
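
The summary does not state how the survey responses were analyzed. As one possible approach, the sketch below compares paired Likert ratings using medians and an exact two-sided sign test. The ratings are invented purely to make the example runnable and are not data from the study; the function `sign_test` and the variable names are likewise hypothetical.

```python
from math import comb
from statistics import median


def sign_test(before, after):
    """Exact two-sided sign test on paired ordinal ratings (ties are dropped)."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    n = len(diffs)
    higher = sum(d > 0 for d in diffs)
    lower = n - higher
    k = min(higher, lower)
    # Probability of a split at least this uneven if both directions are equally likely.
    p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)
    return higher, lower, p_value


# Invented 1-5 Likert ratings from eight hypothetical participants.
visual_only = [3, 2, 4, 3, 3, 2, 4, 3]
with_text = [4, 4, 4, 5, 4, 3, 5, 3]

higher, lower, p_value = sign_test(visual_only, with_text)
print("median visual-only:", median(visual_only))
print("median with text:  ", median(with_text))
print("rated higher / lower with text:", higher, "/", lower)
print("two-sided sign test p-value:", p_value)
```

A sign test uses only the direction of each participant's change, which suits ordinal Likert data and small samples, at the cost of ignoring how large each change was.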

Clinical Impact

This approach could improve the adoption and trustworthiness of AI systems in clinical settings. By translating complex AI decisions into understandable, context-rich narratives, ContextualSHAP may help clinicians without a technical background to better comprehend model outputs for diagnoses, prognoses, or treatment recommendations, thereby supporting more informed and confident clinical decision-making.

Limitations

The authors explicitly state that the findings are preliminary. This suggests that further, more extensive validation studies are needed to confirm the generalizability and robustness of the results across diverse AI models, medical domains, and user groups.

Future Directions

While not explicitly detailed as 'future directions' in the abstract, the preliminary nature of the findings implicitly calls for more comprehensive evaluations to solidify the reported benefits. Further research could focus on validating the long-term impact on user trust, decision-making accuracy, and efficiency in real-world clinical workflows, as well as exploring its applicability across a wider range of medical AI tasks and different LLM architectures.

Medical Domains

  • Clinical Decision Support
  • Medical Diagnostics
  • Predictive Analytics in Healthcare
  • Medical Imaging Interpretation

Keywords

  • Explainable AI
  • SHAP
  • Large Language Model
  • Contextual Explanations
  • Healthcare AI
  • Model Interpretability
  • User Evaluation
  • GPT

Abstract

Explainable Artificial Intelligence (XAI) has become an increasingly important area of research, particularly as machine learning models are deployed in high-stakes domains. Among various XAI approaches, SHAP (SHapley Additive exPlanations) has gained prominence due to its ability to provide both global and local explanations across different machine learning models. While SHAP effectively visualizes feature importance, it often lacks contextual explanations that are meaningful for end-users, especially those without technical backgrounds. To address this gap, we propose a Python package that extends SHAP by integrating it with a large language model (LLM), specifically OpenAI's GPT, to generate contextualized textual explanations. This integration is guided by user-defined parameters (such as feature aliases, descriptions, and additional background) to tailor the explanation to both the model context and the user perspective. We hypothesize that this enhancement can improve the perceived understandability of SHAP explanations. To evaluate the effectiveness of the proposed package, we applied it in a healthcare-related case study and conducted user evaluations involving real end-users. The results, based on Likert-scale surveys and follow-up interviews, indicate that the generated explanations were perceived as more understandable and contextually appropriate compared to visual-only outputs. While the findings are preliminary, they suggest that combining visualization with contextualized text may support more user-friendly and trustworthy model explanations.

Comments

This paper was accepted and presented at the 7th World Symposium on Software Engineering (WSSE) 2025 on 25 October 2025 in Okayama, Japan, and is currently awaiting publication.