On Explainability of Vision Transformers

Samuel Quan

Authors: Samuel Quan, Azadeh Famili, and Dr. Yingjie Lao

Faculty Mentor: Dr. Yingjie Lao

College: College of Engineering, Computing, and Applied Sciences



ABSTRACT

With its robust pattern recognition and prediction capabilities, machine learning is widely used across industries, including healthcare, banking, security, and even entertainment. Although the predictions made by most machine learning models are generally accurate, the reasoning behind them is not easily explained. This opacity becomes problematic when assessing the validity of a model's predictions on data outside its training set. Moreover, by applying explainability techniques to machine learning models, we can reveal issues with the training algorithm or hidden biases in the dataset. If left unchecked, these problems can lead to faulty models, such as automated lending and credit algorithms that discriminate based on race or identification software that focuses on extraneous features.


Understanding how machine learning models make decisions is essential to ensuring that they are accurate and equitable. This project applies explainability techniques to the vision transformer model proposed by Google researchers, visualizing each input pixel's relative importance to the model's classification.
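One common way to produce this kind of importance map for a vision transformer is attention rollout, which composes the self-attention matrices across layers and reads off how strongly the classification token attends to each image patch. The Python sketch below illustrates that idea; it uses the publicly released google/vit-base-patch16-224 checkpoint and the Hugging Face transformers library as stand-ins, and is not necessarily the exact visualization method used in this project.

```python
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

# Publicly released ViT checkpoint from the Google paper, used here only as
# an illustrative stand-in for the model studied in this project.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
model.eval()

image = Image.open("example.jpg")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one (batch, heads, tokens, tokens)
# attention tensor per transformer layer.
num_tokens = outputs.attentions[0].size(-1)
rollout = torch.eye(num_tokens)
for layer_attention in outputs.attentions:
    attn = layer_attention.mean(dim=1)[0]          # average over attention heads
    attn = attn + torch.eye(num_tokens)            # account for residual connections
    attn = attn / attn.sum(dim=-1, keepdim=True)   # re-normalize rows
    rollout = attn @ rollout                       # compose attention across layers

# Row 0 corresponds to the [CLS] token; columns 1..N are the 14x14 image patches.
cls_attention = rollout[0, 1:]
heatmap = cls_attention.reshape(14, 14)
heatmap = heatmap / heatmap.max()  # relative importance of each patch, in [0, 1]
```

Upsampling the 14x14 heatmap to the input resolution and overlaying it on the original image gives a per-pixel view of which regions most influenced the classification.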

Video Introduction

Samuel Quan 2022 Undergraduate Poster Forum