1. Concepts & Definitions
1.1. Regression versus Classification
1.3. Parameter versus Hyperparameter
1.4. Training, Validation, and Test
2. Problem & Solution
2.1. Gaussian Mixture x K-means on HS6 Weight
2.2. Evaluation of classification method using ROC curve
2.3. Comparing logistic regression, neural network, and ensemble
2.4. Fruits or not, split or encode and scale first?
According to [1]:
"Artificial neural networks (ANNs) are powerful tools for data analysis and are particularly suitable for modeling relationships between variables for best prediction of an outcome. While these models can be used to answer many important research questions, their utility has been critically limited because the interpretation of the “black box” model is difficult.
Clinical investigators usually employ ANN models to predict the clinical outcomes or to make a diagnosis; the model however is difficult to interpret for clinicians. To address this important shortcoming of neural network modeling methods, we describe several methods to help subject-matter audiences (e.g., clinicians, medical policy makers) understand neural network models. Garson’s algorithm describes the relative magnitude of the importance of a descriptor (predictor) in its connection with outcome variables by dissecting the model weights."
But there is one important observation [3]:
"A method described in Garson 1991 (also see Goh 1995) identifies the relative importance of explanatory variables for a single response variables in a supervised neural network by deconstructing the model weights. The original algorithm indicates relative importance as the absolute magnitude from zero to one. The algorithm currently only works for neural networks with one hidden layer and one response variable."
Building on the development made in [2], the code can be adapted to implement Garson's algorithm.
Let's now show how Garson's algorithm can be applied to a neural network. First, run all the code available in the following Google Colab (click on the link):
https://colab.research.google.com/drive/1ANZ6eRdPCr7gCGmg4tkgLyFwG_pwLymJ?usp=sharing
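That Colab builds and trains the neural network used in the rest of this post. For reference, a minimal sketch of that kind of setup, assuming a toy dataset with two input features (x0, x1), a single hidden layer, and one output (the actual data, layer sizes, and training settings in the Colab may differ), would be:
import numpy as np
import keras

# Toy data: two input features and one binary target (illustrative assumption only)
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# One hidden layer and one output, as required by Garson's algorithm
model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(4, activation="relu"),    # model.layers[0]: input-hidden weights
    keras.layers.Dense(1, activation="sigmoid")  # model.layers[-1]: hidden-output weights
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=50, batch_size=32, verbose=0)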
Then insert the code related to Garson's algorithm:
import keras
import numpy as np
# Taken from https://csiu.github.io/blog/update/2017/03/29/day33.html
def garson(A, B):
    """
    Computes Garson's algorithm
    A = matrix of weights of input-hidden layer (rows=input & cols=hidden)
    B = vector of weights of hidden-output layer
    """
    B = np.diag(B)
    # connection weights through the different hidden nodes
    cw = np.dot(A, B)
    # total absolute connection weight through each hidden node (sum over inputs, axis=0)
    cw_h = abs(cw).sum(axis=0)
    # relative contribution of each input neuron to the outgoing signal of each hidden neuron
    rc = np.divide(abs(cw), abs(cw_h))
    # sum over hidden nodes to find the relative contribution of each input neuron
    rc = rc.sum(axis=1)
    # normalize to 100% for relative importance
    ri = rc / rc.sum()
    return ri
# Adapted from https://csiu.github.io/blog/update/2017/03/29/day33.html
def VarImpGarson(model):
    # Computes Garson's algorithm for a trained Keras model
    # A = matrix of weights of the input-hidden layer (rows=input & cols=hidden)
    # B = matrix of weights of the hidden-output layer (one column per output)
    A = model.layers[0].get_weights()[0]
    B = model.layers[len(model.layers) - 1].get_weights()[0]
    varScores = 0
    # apply Garson's algorithm to each output column and accumulate the scores
    for i in range(B.shape[1]):
        varScores += garson(A, np.transpose(B)[i])
    # print the indices of the (up to ten) most important input variables, most important first
    print("Importance of input variables: ",
          np.array(varScores).argsort()[-10:][::-1])
    return varScores
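Before applying these functions to the trained model, a quick sanity check on hand-made weights makes the computation concrete. The matrices below are arbitrary numbers chosen here for illustration; they are not taken from the Colab model:
# Toy check: 2 inputs, 3 hidden nodes, arbitrary weights
A_toy = np.array([[0.5, -1.0,  0.2],
                  [1.5,  0.3, -0.4]])  # input-to-hidden weights (rows=inputs, cols=hidden)
B_toy = np.array([0.8, -0.6, 1.1])     # hidden-to-output weights (one per hidden node)
ri_toy = garson(A_toy, B_toy)
print(ri_toy, ri_toy.sum())            # relative importances per input; they sum to 1.0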
Now it is time to test the functions on the trained model.
garson_metrics = VarImpGarson(model)
print(garson_metrics)
Importance of input variables:  [0 1]
[0.56102777 0.4389723 ]
The output above indicates that the input variable x0 has a relative importance of about 56% and x1 of about 44% for the final output of the neural network.
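With more input variables, the raw array becomes hard to read, so it helps to pair each score with its feature name and sort. The names below (x0, x1) are simply the ones used in this example; replace them with the actual column names of your dataset:
# Pair each importance score with its feature name and sort from most to least important
feature_names = ["x0", "x1"]  # assumed names for this example
ranking = sorted(zip(feature_names, garson_metrics), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.1%}")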
The Python code with all the steps is summarized in this Google Colab (click on the link):
https://colab.research.google.com/drive/1fkJ_I5EDW3l0URNPIHxXVV0y2AMa-hfj?usp=sharing
[1] Zhang Z, Beck MW, Winkler DA, Huang B, Sibanda W, Goyal H; written on behalf of AME Big-Data Clinical Trial Collaborative Group. Opening the black box of neural networks: methods for interpreting neural network models in clinical applications. Ann Transl Med. 2018 Jun;6(11):216. doi: 10.21037/atm.2018.05.32. PMID: 30023379; PMCID: PMC6035992.