Explainable artificial intelligence for mental health through transparency and interpretability for understandability
Review of the paper: https://www.nature.com/articles/s41746-023-00751-9
a) Context and Problem to Solve
Mental health care is facing an important technological shift. Artificial intelligence (AI) and machine learning (ML) are increasingly used to support diagnosis, prognosis, and treatment decisions in psychiatry. These tools can identify patterns that human experts might miss, especially in complex datasets such as brain scans or electronic health records. However, a fundamental problem remains: these AI models often function as “black boxes.” They produce results—such as predicting whether a patient has a high risk of depression—but they don’t show how they reached that conclusion. In medicine, where trust, safety, and accountability are crucial, this lack of transparency is a major obstacle.
This concern has led to the growing popularity of the term “explainable AI” (XAI), which refers to models that can offer insights into their own decision-making processes. However, in practice, there is no agreement on what "explainable" really means. Different research papers and AI tools define it differently, or sometimes not at all. Some methods generate explanations that are only understandable to engineers, not clinicians. Others simply highlight which inputs the model found important, without connecting that information to the way doctors reason about diagnosis.
The article addresses this conceptual confusion by suggesting a shift in focus: rather than trying to define explainability in vague or inconsistent terms, researchers should aim to make AI understandable. The authors propose that understandability should be defined as the combination of two more concrete ideas: transparency (knowing what the model does and on what basis) and interpretability (being able to make sense of the model’s operations and outcomes). This approach is particularly relevant in mental health, where clinical data are often ambiguous, and diagnoses are based on a mix of symptoms, probabilities, and social or psychological factors. In such cases, trust in AI cannot be built without clarity and comprehension.
b) Methods Used in the Study
This paper is a conceptual and analytical review, not an empirical experiment. The authors begin by examining the recent literature on explainable AI in the context of mental health. They look at papers published between 2018 and 2022 that mention explainability in psychiatric applications. Out of these, they select 25 papers that meet their criteria, including both original research and reviews. They analyze these papers to see how the term “explainability” is used, whether it is clearly defined, and whether the proposed AI systems are actually understandable by humans, especially clinicians.
The authors then identify patterns and shortcomings in how researchers use and justify their AI methods. They observe that many papers rely on established techniques like SHAP (Shapley Additive Explanations) or LIME (Local Interpretable Model-agnostic Explanations), but often without testing whether these methods are useful in clinical practice. Most studies do not explain how clinicians can or should use the AI output to make decisions.
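To make concrete what such post-hoc tools produce, here is a minimal, hypothetical sketch of computing SHAP attributions for a tabular risk classifier. The data, features, and model below are synthetic illustrations, not material from the reviewed paper.

```python
# Hedged sketch: post-hoc explanation with SHAP on a hypothetical tabular
# risk model. Everything here is synthetic and purely illustrative.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                    # stand-ins for, e.g., questionnaire scores
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # synthetic "high-risk" label

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes per-feature Shapley attributions for each prediction.
explainer = shap.TreeExplainer(model)
attributions = explainer.shap_values(X[:5])

# The attributions show which inputs pushed each prediction up or down,
# but they do not by themselves map onto clinical reasoning.
print(attributions)
```

The output is a set of per-feature contributions to each prediction, which is exactly the kind of technical artefact the authors note is rarely connected to how clinicians actually reason.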
Based on these insights, the authors develop a new conceptual framework: TIFU, which stands for Transparency and Interpretability For Understandability. This framework is grounded in the idea that AI in psychiatry must be understandable to humans if it is to be trusted and used safely. The paper defines and illustrates this framework in detail, using both hypothetical examples and references to real-world models.
c) Key Findings and Results
One of the central findings of the paper is that the term “explainable AI” is used inconsistently in mental health research. In most of the 15 original studies reviewed, the term is either left undefined or tied directly to a particular technical method. Only three of the studies made a serious effort to evaluate how their explanations could be used by clinicians. This suggests a significant gap between technical AI development and its clinical implementation.
In response, the authors offer precise definitions for the components of understandability. Transparency refers to how easily one can examine the input features of the model and understand how these features are used. Interpretability refers to whether a human can follow the model’s internal logic or calculations and relate them to clinical reasoning.
The TIFU framework is structured around the typical AI pipeline: data inputs are transformed into features (the "f(x)" part of the model), and those features are then used to produce outputs (the "g(f(x))" part). For example, in a model predicting obsessive-compulsive disorder (OCD) from brain scans, the initial part of the model might extract features from specific brain regions. If those features can be examined and related to clinical knowledge, say, activity in areas known to be linked with OCD, then the model is transparent. If the final prediction step (the g function) is based on simple, understandable operations, such as a logistic regression, then the model is also interpretable.
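Written out, the composition can be sketched as below; the logistic form of g is just the simple example mentioned above, and the weights w and bias b are notation introduced here for illustration:

$$\hat{y} = g\big(f(x)\big) = \sigma\big(w^{\top} f(x) + b\big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},$$

where x is the raw input (for instance a brain scan), f(x) is the vector of extracted features, and ŷ is the predicted probability of the diagnosis.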
The authors point out that deep learning models, which are often used for image analysis in neuroimaging, usually fail the test of understandability. These models are powerful but difficult to interpret. However, they can still be useful if used in a modular way: for example, the feature extraction can be done with deep learning, and the final decision can be made by a simpler, interpretable model. This separation allows for transparency in one part of the system and interpretability in another.
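As a rough illustration of this modular split, the sketch below separates an opaque feature extractor from an interpretable decision model. The random projection stands in for a trained deep network, and all data are synthetic assumptions rather than anything from the paper.

```python
# Hedged sketch of the modular design: opaque feature extraction upstream,
# an interpretable decision model downstream. Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for high-dimensional imaging data (100 "scans", 5000 voxels).
scans = rng.normal(size=(100, 5000))

# f(x): opaque feature extraction. A fixed random projection stands in for a
# trained deep network that summarises activity in three brain regions.
projection = rng.normal(size=(5000, 3))
features = scans @ projection

# Synthetic diagnostic labels, loosely tied to the first extracted feature.
labels = (features[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

# g(f(x)): an interpretable decision model whose coefficients can be inspected
# and related to the three extracted features.
decision_model = LogisticRegression().fit(features, labels)
print("coefficients a clinician can inspect:", decision_model.coef_)
```

The design choice is that only the downstream coefficients need to be understood by a clinician, while the upstream extractor is judged by whether the features it produces are themselves transparent.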
d) Conclusions and Main Implications
The authors conclude that in mental health care, where data are often ambiguous and decisions are high-stakes, AI systems must be understandable to be trusted and useful. Their review shows that most current research in XAI for psychiatry does not meet this standard. Many studies use technical tools that are difficult to interpret or fail to define what they mean by explainability. This lack of clarity risks undermining the potential of AI in clinical practice.
To address this, the authors propose a shift from the vague goal of explainability to the more concrete and achievable goal of understandability. They recommend that AI developers focus on building models that are transparent and interpretable, and that researchers evaluate their systems from the perspective of human users, not just technical performance.
Two specific recommendations are proposed. First, when using opaque AI models to explore data, researchers should use the findings to build simpler, interpretable models that retain only the most important features. This helps ensure that the final model is understandable. Second, when complex models are necessary—for example, when processing high-dimensional data like brain scans—they should be broken into modules. A complex feature extraction model can be used upstream, but the downstream decision-making should be handled by a simpler model that clinicians can understand.
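A hedged sketch of the first recommendation might look as follows, with an exploratory random forest used only to rank features and a logistic regression kept as the final, understandable model; the dataset and the choice of retaining five features are placeholders.

```python
# Hedged sketch: use an opaque model to rank features, then keep only the
# most important ones in a simple, interpretable model. Data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 50))                                        # 50 candidate features
y = (X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=300) > 0).astype(int)  # synthetic label

# Exploratory, opaque model used only to rank the candidate features.
forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
top_k = np.argsort(forest.feature_importances_)[::-1][:5]

# Final, understandable model restricted to the retained features.
simple_model = LogisticRegression().fit(X[:, top_k], y)
print("retained feature indices:", top_k)
print("logistic regression coefficients:", simple_model.coef_)
```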
By introducing the TIFU framework, the authors aim to provide a practical and principled way to guide the development of AI systems in psychiatry. Their approach emphasizes that true explainability is not just about technical tools—it’s about aligning AI systems with the way human clinicians think, reason, and make decisions.