
Introduction:
AI is transforming medical diagnosis, and explainable AI (XAI) is what makes it trustworthy. AI predictions are often treated as black boxes; XAI opens those boxes.
In an earlier tutorial, we built a hypothyroid diagnosis model using neural networks here. In this tutorial, we apply explainable AI (XAI) to the same example, using the same pre-trained model: we identify the important features, i.e. which ones contributed to the model's prediction. You can refer to this project on GitHub here.
Goal:
In this tutorial, we use the DALEX library to explain the trained model.
This XAI approach is model agnostic: it works on linear models and non-linear models alike.
Through the medical-diagnosis example, we demonstrate the use of XAI.
Glossary:
- Deep Learning: A machine learning technique that learns through neural networks inspired by the human brain.
- Hypothyroid: Also called an underactive thyroid; a condition in which a person's body does not produce enough thyroid hormones.
- Shapley values: A game-theoretic measure of the contribution of each feature to a prediction.
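To make the Shapley-value entry concrete, here is a tiny worked example: a feature's Shapley value is its marginal contribution averaged over all possible orderings of the features. The coalition payoffs below are made-up numbers purely for illustration.

```python
from itertools import permutations

# Made-up payoffs: the "prediction gain" achieved by each coalition of features.
v = {frozenset(): 0.0,
     frozenset({'TSH'}): 0.6,
     frozenset({'TT4'}): 0.4,
     frozenset({'TSH', 'TT4'}): 1.0}

def shapley(player, players, v):
    """Average marginal contribution of `player` over all orderings."""
    orders = list(permutations(players))
    total = 0.0
    for order in orders:
        before = frozenset(order[:order.index(player)])
        total += v[before | {player}] - v[before]
    return total / len(orders)

players = ['TSH', 'TT4']
print(shapley('TSH', players, v))  # 0.6
print(shapley('TT4', players, v))  # 0.4
```

Note that the two values sum to the payoff of the full coalition (1.0), a property of Shapley values known as efficiency.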
Prerequisites:
- Programming knowledge in Python.
- Basic knowledge of Jupyter Notebook, Deep Learning, Keras.
How can we explain an AI in medical diagnosis with a pre-trained model?
We can explain pre-trained models using libraries such as DALEX and SHAP. Internally, they rely on concepts like game theory, additive modelling, and partial dependence.
A model can be explained with global interpretation or local interpretation. Global interpretations cover the whole dataset; a local interpretation explains the prediction for a single instance of the dataset.
In general we should prefer global interpretations, but a local interpretation works like a postmortem. For example, suppose the model predicts hypothyroid for person A, while healthcare experts disagree about the same person A. Then we need to find out the reason.
In this tutorial, we use DALEX.
Step 1: Install the DALEX library as a Python package.
!pip install dalex
Step 2. Load the data and perform preprocessing.
We preprocess the data below. The steps are not explained in detail here, but you can refer to them here.
# Required imports
import numpy as np
import pandas as pd
# Load dataset from csv using pandas
dataset = pd.read_csv('data/hypothyroid.csv')
dataset.head()
# Rename the first column as target
dataset = dataset.rename(columns={dataset.columns[0]: "target"})
dataset["target"] = dataset["target"].map({"negative": 0, "hypothyroid": 1})
# Replace the categorical values with binary values
dataset = dataset.replace({'f': 0, 't': 1, 'y': 1, 'n': 0, 'M': 0, 'F': 1})
# Replace '?' with NaN values
dataset.replace(to_replace='?', inplace=True, value=np.nan)
# Count the number of null values
dataset.isnull().sum()
# Drop the TBG column as it contains an extremely high number of null values
dataset.drop('TBG', axis=1, inplace=True)
# Select columns with data type 'object'
columns = dataset.columns[dataset.dtypes.eq('object')]
# Convert them to numeric values
dataset[columns] = dataset[columns].apply(pd.to_numeric, errors='coerce')
# Replace null values with the mean
dataset['Age'].fillna(dataset['Age'].mean(), inplace=True)
dataset['T4U'].fillna(dataset['T4U'].mean(), inplace=True)
# Replace null values with the median
dataset['TSH'].fillna(dataset['TSH'].median(), inplace=True)
dataset['T3'].fillna(dataset['T3'].median(), inplace=True)
dataset['TT4'].fillna(dataset['TT4'].median(), inplace=True)
dataset['FTI'].fillna(dataset['FTI'].median(), inplace=True)
# The gender data looks imbalanced, with fewer 0s than 1s
# Replace its null values with 0
dataset['Gender'].fillna(0, inplace=True)
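Step 4 below references X_train_df and Y_train_df, which come from the earlier model-training project. A minimal sketch of how such frames could be prepared follows; the tiny stand-in frame, the 80/20 split ratio, and the seed are assumptions for illustration, not the original project's code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny stand-in frame; in the tutorial, `dataset` is the preprocessed
# hypothyroid data from Step 2.
dataset = pd.DataFrame({'target': [0, 1] * 10,
                        'TSH': range(20),
                        'TT4': range(20)})

X = dataset.drop('target', axis=1)
y = dataset['target']
# The split ratio, seed, and stratification are assumed choices.
X_train_df, X_test_df, Y_train_df, Y_test_df = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(X_train_df.shape)  # (16, 2)
```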
Step 3. Load the saved pre-trained model.
We load our pre-trained model using the load_model() method from Keras.
from keras.models import load_model
model = load_model('models/saved_model.h5')
Step 4. Prepare the explainer for the model used for AI in medical diagnosis.
We prepare the explainer for the model; the parameters used to create the explainer object are shown below.
import dalex as dx
explainer = dx.Explainer(model, X_train_df, Y_train_df, label='Hypothyroidism')

Step 5. Call model_parts() and plot the findings.
model_parts() perturbs an explanatory variable, for example by resampling from an empirical distribution or permuting the variable's values, to check the variable's impact on the result.
By default, it calculates B = 10 permutations of variable importance on N = 1000 observations; we can also set custom values for B and N.
In our case, the features TT4, TSH, and FTI appear to be the most important in the decision making.
explainer.model_parts().plot()
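The permutation idea that model_parts() relies on can be sketched in plain NumPy. The toy model and squared-error loss below are illustrative assumptions, not DALEX internals: importance is measured as the increase in loss after shuffling one column.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy black-box model: depends only on column 0, ignores column 1.
def predict(X):
    return 2.0 * X[:, 0]

X = rng.normal(size=(1000, 2))
y = predict(X)  # loss is zero before any permutation

def permutation_importance(predict, X, y, col, n_repeats=10):
    """Mean increase in squared-error loss after shuffling one column."""
    base = np.mean((predict(X) - y) ** 2)
    increases = []
    for _ in range(n_repeats):
        Xp = X.copy()
        Xp[:, col] = rng.permutation(Xp[:, col])
        increases.append(np.mean((predict(Xp) - y) ** 2) - base)
    return float(np.mean(increases))

imp0 = permutation_importance(predict, X, y, 0)
imp1 = permutation_importance(predict, X, y, 1)
print(imp0 > imp1)  # True: shuffling the informative column hurts far more
```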

Step 6. Check model performance with different loss functions.
explainer.model_performance()
We get the following output.

In summary, the model performs well under all the loss functions.
Step 7. Explain feature importance using Shapley values.
explainer.model_parts(type='shap_wrapper').plot()
Again, TT4, FTI, and TSH come out as the important features, this time using SHAP calculations.

Step 8. Check PD (partial dependence).
We check partial dependence scores by calling model_profile(). A partial dependence plot changes the value of one explanatory variable while keeping the other variables fixed; as a result, it isolates the impact of that one variable.
pd_rf = explainer.model_profile()
pd_rf.result
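The procedure described above (fix one variable at a grid value for every row, average the predictions, then repeat along the grid) can be sketched as follows. The toy model standing in for our network is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: the prediction rises with feature 0 and ignores feature 1.
def predict(X):
    return 1.0 / (1.0 + np.exp(-X[:, 0]))

X = rng.normal(size=(500, 2))

def partial_dependence(predict, X, col, grid):
    """Average prediction when `col` is forced to each grid value."""
    averages = []
    for g in grid:
        Xg = X.copy()
        Xg[:, col] = g  # override one column for ALL rows
        averages.append(predict(Xg).mean())
    return np.array(averages)

grid = np.linspace(-3, 3, 7)
pdp = partial_dependence(predict, X, 0, grid)
print(np.all(np.diff(pdp) > 0))  # True: the PD curve is monotone in feature 0
```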

Subsequently, we plot PD for a group of variables.
explainer.model_profile().plot(variables=['TT4', 'FTI', 'TSH', 'T3'])

Step 9. Use the model_surrogate() method.
This is an interesting concept: the complex model is approximated by a simpler, interpretable surrogate model. Here, the library uses a DecisionTreeRegressor.
surrogate_model = explainer.model_surrogate(max_vars=6, max_depth=3)
surrogate_model.performance
surrogate_model.plot()
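The surrogate idea can be sketched with scikit-learn. The stand-in black box below is an assumption for illustration; the key point is that the interpretable tree is fitted to the black-box model's predictions, not to the true labels.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)

# Stand-in black box: a step function of feature 0 that a shallow tree
# can recover almost perfectly.
def black_box(X):
    return (X[:, 0] > 0).astype(float)

X = rng.normal(size=(1000, 3))
y_hat = black_box(X)  # surrogate targets = black-box outputs

surrogate = DecisionTreeRegressor(max_depth=3).fit(X, y_hat)
print(surrogate.score(X, y_hat) > 0.95)  # True: R^2 close to 1.0 on this easy target
```

Inspecting the fitted tree's splits then gives a human-readable approximation of the black box's decision logic.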

Learning Tools and Strategies:
- First, learn about additive modelling.
- Second, understand the basics of game theory.
- Third, and most importantly, game theory is the foundation of SHapley Additive exPlanations (SHAP).
- We use DALEX because it produces better plots, and its documentation is very detailed.
- The SHAP library version we tried has some issues; otherwise, the SHAP package can be used as well.
Following the order above makes the concept easier to understand. You can find the references in the Citations section below.
Reflective Analysis:
Black-box AI models are a big challenge, and reliable AI in medical diagnosis demands that we explain our models using XAI. Without XAI, it is difficult to accept a disease diagnosis, so XAI is an urgent need in AI diagnosis.
The same holds beyond medicine: financial transactions are critical in nature. The decision to approve a loan, for instance, is a critical AI use case, and if we cannot explain a loan rejection, customers become unhappy.
In short, we can apply XAI wherever the need arises.
Conclusions and Future Directions for AI in medical diagnosis using XAI:
In conclusion, we have explained the importance of XAI in AI diagnosis. XAI can explain neural networks, and therefore nonlinear relationships.
We also use AI in financial decision making, for example loan approvals, credit ranking, and price prediction; XAI can be used there as well. In other words, XAI is useful almost everywhere: it helps us understand a difficult hypothyroid diagnosis, and it also helps us understand financial decisions.
Lastly, we discussed only a few of the methods in this project; there are other XAI methods and tools to learn, such as Facets and the What-If Tool.
Citations:
- The dataset to perform AI in medical diagnosis is here.
- The model training and feature engineering details are here.
- DALEX library documentation is available here.
- Shapley value: https://en.wikipedia.org/wiki/Shapley_value
- Image Source: https://images.pexels.com/photos/5649885/pexels-photo-5649885.jpeg?cs=srgb&dl=pexels-thirdman-5649885.jpg&fm=jpg