As AI becomes more ubiquitous in our lives, it is increasingly important that models are accurate, free of unwanted bias, and understandable. One way to tackle this is through AI explainability and interpretable machine learning models. Below are some links on bias and interpretability.
This is how AI bias really happens—and why it’s so hard to fix (Karen Hao, MIT Technology Review, Feb 2019)
The birth of machine bias
Computer program to automate admissions
Machine Learning and Human Bias (video from Google)
https://www.youtube.com/watch?v=59bMh59JQDo
Examples of interaction, latent, and selection bias
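As a tiny, made-up illustration of selection bias: a model fit only on a non-representative slice of the data can look fine on the group it was trained on and fail badly on the group that was filtered out. The groups and numbers below are purely illustrative.

```python
# Toy illustration of selection bias: the training sample systematically
# excludes one subpopulation, so the fitted model generalises poorly to it.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Two subpopulations with different input/output relationships.
x_a = rng.uniform(0, 1, 500)
y_a = 2.0 * x_a + rng.normal(0, 0.1, 500)
x_b = rng.uniform(0, 1, 500)
y_b = -1.0 * x_b + rng.normal(0, 0.1, 500)

# Selection bias: only subpopulation A makes it into the training set.
model = LinearRegression().fit(x_a.reshape(-1, 1), y_a)

print("MSE on group A:", np.mean((model.predict(x_a.reshape(-1, 1)) - y_a) ** 2))
print("MSE on group B:", np.mean((model.predict(x_b.reshape(-1, 1)) - y_b) ** 2))
```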
Notes on AI Bias
https://www.ben-evans.com/benedictevans/2019/4/15/notes-on-ai-bias
What do we do about bias?
Methodological rigour in the collection and management of the training data.
Technical tools to analyse and diagnose the behaviour of the model.
Training, education, and caution in the deployment of ML in products.
“Machine Learning can do anything you could train a dog to do - but you’re never totally sure what you trained the dog to do.”
Introducing Activation Atlases (OpenAI, March 2019)
https://openai.com/blog/introducing-activation-atlases/
Visualizing what interactions between neurons can represent
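The pipeline behind an activation atlas can be roughly sketched as: collect activation vectors for many inputs, project them to 2D, and average the activations that land in each grid cell (the authors then render each averaged vector with feature visualisation, which is omitted here). A hedged sketch using PCA as a stand-in for the UMAP projection used in the original work; `activations` is assumed to be a pre-computed array of hidden-layer activations.

```python
# Sketch of the activation-atlas pipeline (minus the feature-visualisation step):
# 1) collect activation vectors for many inputs,
# 2) project them to 2D (the original work uses UMAP; PCA is a stand-in here),
# 3) average the activation vectors falling into each grid cell.
import numpy as np
from sklearn.decomposition import PCA

def build_atlas_cells(activations, grid=20):
    """activations: (n_samples, n_channels) hidden-layer activations."""
    coords = PCA(n_components=2).fit_transform(activations)
    # Normalise the 2-D coordinates to [0, 1) and bucket them into a grid.
    span = coords.max(axis=0) - coords.min(axis=0) + 1e-9
    cells = ((coords - coords.min(axis=0)) / span * grid).astype(int)
    cells = cells.clip(0, grid - 1)

    atlas = {}
    for gx, gy in {tuple(c) for c in cells}:
        mask = (cells[:, 0] == gx) & (cells[:, 1] == gy)
        # In the real atlas, each averaged activation vector is then rendered
        # with feature visualisation to produce one tile of the atlas image.
        atlas[(gx, gy)] = activations[mask].mean(axis=0)
    return atlas

# Stand-in activations; in practice these come from a chosen layer of the model.
atlas = build_atlas_cells(np.random.rand(5000, 512))
print(len(atlas), "non-empty atlas cells")
```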
Representer Point Selection for Explaining Deep Neural Networks
Blog post: https://blog.ml.cmu.edu/2019/04/19/representer-point-selection-explain-dnn
Code: https://github.com/chihkuanyeh/Representer_Point_Selection
Model explainability by examining the training examples most relevant to a particular output.
"not only help us understand the predictions of a given DNN, but also provide insights into how to improve the performance of the model."
Alibi
https://github.com/SeldonIO/alibi
An open-source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
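For example, anchor explanations (one of the instance-based methods in Alibi) return a minimal set of feature predicates that "anchor" a model's prediction. A hedged sketch based on the AnchorTabular interface in Alibi's docs; exact argument and attribute names may differ between versions.

```python
# Sketch of a black-box, instance-based explanation with Alibi's AnchorTabular.
# (Interface as described in the Alibi docs; attribute names may vary by version.)
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from alibi.explainers import AnchorTabular

data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = AnchorTabular(clf.predict_proba, feature_names=data.feature_names)
explainer.fit(data.data, disc_perc=(25, 50, 75))   # discretise numerical features

explanation = explainer.explain(data.data[0], threshold=0.95)
print("anchor:   ", explanation.anchor)      # e.g. rules like "petal width <= 0.8"
print("precision:", explanation.precision)
print("coverage: ", explanation.coverage)
```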
Do Explanations Reflect Decisions? A Machine-centric Strategy to Quantify the Performance of Explainability Algorithms [2019]
Quantitative performance metrics for explainability algorithms
i) Impact Score: assesses the % of critical factors that have either a strong confidence-reduction impact or a decision-changing impact
ii) Impact Coverage: assesses the % coverage of adversarially impacted factors in the input
Analysed LIME, SHAP, Expected Gradients, and GSInquire on a ResNet-50 deep convolutional neural network, using a subset of ImageNet for the task of image classification
Across the tested images, LIME had the lowest impact on the decision-making process of the network (~38%), with progressively higher decision-making impact for SHAP (~44%), Expected Gradients (~51%), and GSInquire (~76%)
https://arxiv.org/abs/1910.07387
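A rough sketch of how an Impact-Score-style metric can be computed, assuming the explainer already provides a per-image binary mask of critical factors; zeroing out the masked pixels and the 50% confidence-drop threshold are illustrative choices, not necessarily the paper's exact protocol.

```python
# Sketch of an Impact-Score-style metric: remove the critical factors identified
# by an explainer and measure how often the decision changes or the confidence
# in the original class drops strongly.
import numpy as np

def impact_score(model, images, critical_masks, strong_drop=0.5):
    """model(images) -> (n, num_classes) class probabilities.
    critical_masks: boolean arrays (same shape as images) marking the
    explainer's critical pixels."""
    probs = model(images)
    preds = probs.argmax(axis=1)

    # "Remove" the critical factors by zeroing them out (one of several options).
    ablated = images.copy()
    ablated[critical_masks] = 0.0
    probs_ablated = model(ablated)

    decision_changed = probs_ablated.argmax(axis=1) != preds
    orig_conf = probs[np.arange(len(preds)), preds]
    new_conf = probs_ablated[np.arange(len(preds)), preds]
    strong_confidence_drop = new_conf < (1.0 - strong_drop) * orig_conf

    return np.mean(decision_changed | strong_confidence_drop)
```

Impact Coverage would instead compare the explainer's critical mask against the region of an adversarial patch inserted into the input.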
Racial Bias Found in Algorithms That Determine Health Care for Millions of Patients
"The problem arose from the design of the algorithm, and specifically, what it was designed to predict. In trying to determine who would most benefit from the care management program, it predicted each patient’s medical costs over the coming year. It based its predictions on historical data."
InterpretML
https://github.com/interpretml/interpret
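InterpretML bundles glassbox models (notably the Explainable Boosting Machine) alongside black-box explainers. A minimal hedged example following the usage pattern from the repo's README; the dataset and default parameters are illustrative choices.

```python
# Glassbox interpretability with InterpretML's Explainable Boosting Machine.
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

show(ebm.explain_global())                       # per-feature contribution curves
show(ebm.explain_local(X_test[:5], y_test[:5]))  # per-prediction explanations
```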
Made With ML: interpretability topic
https://madewithml.com/topics/interpretability/
Animations of Neural Networks Transforming Data
https://towardsdatascience.com/animations-of-neural-networks-transforming-data-42005e8fffd9
Animations of how neural networks transform data until it is nearly linearly separable, giving better intuition about why and how neural networks work
Contains simple Python code; a small self-contained sketch of the same idea follows below
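The sketch below (not the article's code, just the same intuition, using scikit-learn) trains a tiny MLP on data that is not linearly separable and checks that a plain linear classifier does far better on the learned hidden-layer representation than on the raw inputs.

```python
# Intuition check: a hidden layer remaps non-linearly-separable data into a
# representation that a plain linear classifier can separate almost perfectly.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_circles(n_samples=1000, noise=0.05, factor=0.4, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="relu",
                    max_iter=2000, random_state=0).fit(X, y)

# Manually compute the hidden-layer (ReLU) activations learned by the MLP.
hidden = np.maximum(0.0, X @ mlp.coefs_[0] + mlp.intercepts_[0])

linear_raw = LogisticRegression().fit(X, y).score(X, y)
linear_hidden = LogisticRegression().fit(hidden, y).score(hidden, y)
print(f"linear classifier on raw 2-D inputs:     {linear_raw:.2f}")
print(f"linear classifier on hidden activations: {linear_hidden:.2f}")
```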
Examples of interpretability catching models that rely on spurious cues:
https://arxiv.org/pdf/2009.05383v1.pdf "erroneous indicators in the CT images (e.g., patient tables of the CT scanners, imaging artifacts, etc.) were being leveraged by the network to make predictions. To help prevent this behaviour, we introduce an additional augmentation which removes any visual indicators which lie outside of the patient’s body, as illustrated in Figure 5"
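A hedged sketch of the kind of augmentation the quote describes: build a rough body mask for a CT slice and overwrite everything outside it, so cues such as the scanner table cannot be exploited. The Hounsfield-unit threshold and morphological steps below are generic choices for illustration, not the paper's exact procedure.

```python
# Illustrative "remove everything outside the patient's body" augmentation for a
# CT slice (2-D array of Hounsfield units). Thresholds are generic, not the paper's.
import numpy as np
from scipy import ndimage

def mask_outside_body(ct_slice, air_threshold=-500.0, background_value=-1000.0):
    body = ct_slice > air_threshold                 # rough "not air" mask
    labels, num = ndimage.label(body)
    if num == 0:
        return ct_slice
    # Keep only the largest connected component (ideally the body, not the table).
    sizes = ndimage.sum(body, labels, index=np.arange(1, num + 1))
    body = labels == (np.argmax(sizes) + 1)
    body = ndimage.binary_fill_holes(body)          # keep lungs/airways inside the mask
    cleaned = ct_slice.copy()
    cleaned[~body] = background_value               # overwrite table, artifacts, etc.
    return cleaned
```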