As AI becomes more ubiquitous in our lives, it is increasingly important that models are accurate, free of unwanted bias, and understandable. One way to tackle this is through AI explainability and interpretable machine learning models. Below are some links on bias and interpretability.
This is how AI bias really happens—and why it’s so hard to fix (Karen Hao, MIT Technology Review, Feb 2019)
The birth of machine bias
Computer program to automate admissions
Machine Learning and Human Bias (video from Google)
https://www.youtube.com/watch?v=59bMh59JQDo
Examples of interaction, latent, and selection bias
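As a tiny, made-up illustration of selection bias: a model fit only on a non-representative slice of the data can look fine on the group it was trained on and fail badly on the group that was filtered out. The groups and numbers below are purely illustrative.

```python
# Toy illustration of selection bias: the training sample systematically
# excludes one subpopulation, so the fitted model generalises poorly to it.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Two subpopulations with different input/output relationships.
x_a = rng.uniform(0, 1, 500)
y_a = 2.0 * x_a + rng.normal(0, 0.1, 500)
x_b = rng.uniform(0, 1, 500)
y_b = -1.0 * x_b + rng.normal(0, 0.1, 500)

# Selection bias: only subpopulation A makes it into the training set.
model = LinearRegression().fit(x_a.reshape(-1, 1), y_a)

print("MSE on group A:", np.mean((model.predict(x_a.reshape(-1, 1)) - y_a) ** 2))
print("MSE on group B:", np.mean((model.predict(x_b.reshape(-1, 1)) - y_b) ** 2))
```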
Notes on AI Bias
https://www.ben-evans.com/benedictevans/2019/4/15/notes-on-ai-bias
What do we do about bias?
Methodological rigour in the collection and management of the training data.
Technical tools to analyse and diagnose the behaviour of the model.
Training, education, and caution in the deployment of ML in products.
“Machine Learning can do anything you could train a dog to do - but you’re never totally sure what you trained the dog to do.”
Introducing Activation Atlases (OpenAI, March 2019)
https://openai.com/blog/introducing-activation-atlases/
Visualizing what interactions between neurons can represent
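The pipeline behind an activation atlas can be roughly sketched as: collect activation vectors for many inputs, project them to 2D, and average the activations that land in each grid cell (the authors then render each averaged vector with feature visualisation, which is omitted here). A hedged sketch using PCA as a stand-in for the UMAP projection used in the original work; `activations` is assumed to be a pre-computed array of hidden-layer activations.

```python
# Sketch of the activation-atlas pipeline (minus the feature-visualisation step):
# 1) collect activation vectors for many inputs,
# 2) project them to 2D (the original work uses UMAP; PCA is a stand-in here),
# 3) average the activation vectors falling into each grid cell.
import numpy as np
from sklearn.decomposition import PCA

def build_atlas_cells(activations, grid=20):
    """activations: (n_samples, n_channels) hidden-layer activations."""
    coords = PCA(n_components=2).fit_transform(activations)
    # Normalise the 2-D coordinates to [0, 1) and bucket them into a grid.
    span = coords.max(axis=0) - coords.min(axis=0) + 1e-9
    cells = ((coords - coords.min(axis=0)) / span * grid).astype(int)
    cells = cells.clip(0, grid - 1)

    atlas = {}
    for gx, gy in {tuple(c) for c in cells}:
        mask = (cells[:, 0] == gx) & (cells[:, 1] == gy)
        # In the real atlas, each averaged activation vector is then rendered
        # with feature visualisation to produce one tile of the atlas image.
        atlas[(gx, gy)] = activations[mask].mean(axis=0)
    return atlas

# Stand-in activations; in practice these come from a chosen layer of the model.
atlas = build_atlas_cells(np.random.rand(5000, 512))
print(len(atlas), "non-empty atlas cells")
```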
Representer Point Selection for Explaining Deep Neural Networks
Blog post: https://blog.ml.cmu.edu/2019/04/19/representer-point-selection-explain-dnn
Code: https://github.com/chihkuanyeh/Representer_Point_Selection
Model explainability by examining the training examples most relevant to a particular output.
"not only help us understand the predictions of a given DNN, but also provide insights into how to improve the performance of the model."
Alibi
https://github.com/SeldonIO/alibi
An open-source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
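For example, anchor explanations (one of the instance-based methods in Alibi) return a minimal set of feature predicates that "anchor" a model's prediction. A hedged sketch based on the AnchorTabular interface in Alibi's docs; exact argument and attribute names may differ between versions.

```python
# Sketch of a black-box, instance-based explanation with Alibi's AnchorTabular.
# (Interface as described in the Alibi docs; attribute names may vary by version.)
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from alibi.explainers import AnchorTabular

data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = AnchorTabular(clf.predict_proba, feature_names=data.feature_names)
explainer.fit(data.data, disc_perc=(25, 50, 75))   # discretise numerical features

explanation = explainer.explain(data.data[0], threshold=0.95)
print("anchor:   ", explanation.anchor)      # e.g. rules like "petal width <= 0.8"
print("precision:", explanation.precision)
print("coverage: ", explanation.coverage)
```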
Do Explanations Reflect Decisions? A Machine-centric Strategy to Quantify the Performance of Explainability Algorithms [2019]
Quantitative performance metrics for explainability algorithms
i) Impact Score: assesses the % of critical factors that have either a strong confidence-reduction impact or a decision-changing impact
ii) Impact Coverage: assesses the % coverage of adversarially impacted factors in the input
Analysed LIME, SHAP, Expected Gradients, and GSInquire on a ResNet-50 deep convolutional neural network, using a subset of ImageNet for the task of image classification
Across the tested images, LIME had the lowest impact on the decision-making process of the network (~38%), with progressively higher decision-making impact for SHAP (~44%), Expected Gradients (~51%), and GSInquire (~76%)
https://arxiv.org/abs/1910.07387
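A rough sketch of how an Impact-Score-style metric can be computed, assuming the explainer already provides a per-image binary mask of critical factors; zeroing out the masked pixels and the 50% confidence-drop threshold are illustrative choices, not necessarily the paper's exact protocol.

```python
# Sketch of an Impact-Score-style metric: remove the critical factors identified
# by an explainer and measure how often the decision changes or the confidence
# in the original class drops strongly.
import numpy as np

def impact_score(model, images, critical_masks, strong_drop=0.5):
    """model(images) -> (n, num_classes) class probabilities.
    critical_masks: boolean arrays (same shape as images) marking the
    explainer's critical pixels."""
    probs = model(images)
    preds = probs.argmax(axis=1)

    # "Remove" the critical factors by zeroing them out (one of several options).
    ablated = images.copy()
    ablated[critical_masks] = 0.0
    probs_ablated = model(ablated)

    decision_changed = probs_ablated.argmax(axis=1) != preds
    orig_conf = probs[np.arange(len(preds)), preds]
    new_conf = probs_ablated[np.arange(len(preds)), preds]
    strong_confidence_drop = new_conf < (1.0 - strong_drop) * orig_conf

    return np.mean(decision_changed | strong_confidence_drop)
```

Impact Coverage would instead compare the explainer's critical mask against the region of an adversarial patch inserted into the input.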
Racial Bias Found in Algorithms That Determine Health Care for Millions of Patients
"The problem arose from the design of the algorithm, and specifically, what it was designed to predict. In trying to determine who would most benefit from the care management program, it predicted each patient’s medical costs over the coming year. It based its predictions on historical data."
InterpretML
https://github.com/interpretml/interpret
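InterpretML bundles glassbox models (notably the Explainable Boosting Machine) alongside black-box explainers. A minimal hedged example following the usage pattern from the repo's README; the dataset and default parameters are illustrative choices.

```python
# Glassbox interpretability with InterpretML's Explainable Boosting Machine.
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

show(ebm.explain_global())                       # per-feature contribution curves
show(ebm.explain_local(X_test[:5], y_test[:5]))  # per-prediction explanations
```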
Made With ML: interpretability topic
https://madewithml.com/topics/interpretability/
Animations of Neural Networks Transforming Data
https://towardsdatascience.com/animations-of-neural-networks-transforming-data-42005e8fffd9
Animations of how neural networks transform data until it is nearly linearly separable, giving better intuition about why and how neural networks work
Contains simple Python code; a small self-contained sketch of the same idea follows below
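The sketch below (not the article's code, just the same intuition, using scikit-learn) trains a tiny MLP on data that is not linearly separable and checks that a plain linear classifier does far better on the learned hidden-layer representation than on the raw inputs.

```python
# Intuition check: a hidden layer remaps non-linearly-separable data into a
# representation that a plain linear classifier can separate almost perfectly.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_circles(n_samples=1000, noise=0.05, factor=0.4, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="relu",
                    max_iter=2000, random_state=0).fit(X, y)

# Manually compute the hidden-layer (ReLU) activations learned by the MLP.
hidden = np.maximum(0.0, X @ mlp.coefs_[0] + mlp.intercepts_[0])

linear_raw = LogisticRegression().fit(X, y).score(X, y)
linear_hidden = LogisticRegression().fit(hidden, y).score(hidden, y)
print(f"linear classifier on raw 2-D inputs:     {linear_raw:.2f}")
print(f"linear classifier on hidden activations: {linear_hidden:.2f}")
```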
Examples of interpretability catching models that rely on spurious cues:
https://arxiv.org/pdf/2009.05383v1.pdf "erroneous indicators in the CT images (e.g., patient tables of the CT scanners, imaging artifacts, etc.) were being leveraged by the network to make predictions. To help prevent this behaviour, we introduce an additional augmentation which removes any visual indicators which lie outside of the patient’s body, as illustrated in Figure 5"
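A hedged sketch of the kind of augmentation the quote describes: build a rough body mask for a CT slice and overwrite everything outside it, so cues such as the scanner table cannot be exploited. The Hounsfield-unit threshold and morphological steps below are generic choices for illustration, not the paper's exact procedure.

```python
# Illustrative "remove everything outside the patient's body" augmentation for a
# CT slice (2-D array of Hounsfield units). Thresholds are generic, not the paper's.
import numpy as np
from scipy import ndimage

def mask_outside_body(ct_slice, air_threshold=-500.0, background_value=-1000.0):
    body = ct_slice > air_threshold                 # rough "not air" mask
    labels, num = ndimage.label(body)
    if num == 0:
        return ct_slice
    # Keep only the largest connected component (ideally the body, not the table).
    sizes = ndimage.sum(body, labels, index=np.arange(1, num + 1))
    body = labels == (np.argmax(sizes) + 1)
    body = ndimage.binary_fill_holes(body)          # keep lungs/airways inside the mask
    cleaned = ct_slice.copy()
    cleaned[~body] = background_value               # overwrite table, artifacts, etc.
    return cleaned
```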