CSEP 590B Explainable AI
General Information
Lecture time: Tuesdays, 6:30-9:20 pm
Location: Bill & Melinda Gates Center (CSE2) G10
Instructors: Su-In Lee and Ian Covert
Teaching assistants: Hugh Chen and Chris Lin
Office hours
Su-In Lee: Thursdays 5:00-6:00 pm @ Zoom
Ian Covert: Sundays 8:00-9:00 pm @ Zoom
Hugh Chen: Tuesdays 5:30-6:30 pm @ Gates 131
Chris Lin: Fridays 4:30-5:30 pm @ Zoom
*Note about the course materials
If you are teaching an XAI course and want to use any of our course materials, please feel free to reach out to us. We designed our slides and homeworks from scratch, and we hope they can be useful to many students and researchers working in this area.
Overview
This course is about explainable artificial intelligence (XAI), a subfield of machine learning that provides transparency for complex models. Modern machine learning relies heavily on black-box models like tree ensembles and deep neural networks; these models provide state-of-the-art accuracy, but they make it difficult to understand the features, concepts, and data examples that drive their predictions. As a consequence, it's difficult for users, experts, and organizations to trust such models, and it's challenging to learn about the underlying processes we're modeling.
In response, some argue that we should rely on inherently interpretable models in high-stakes applications, such as medicine and consumer finance. Others advocate for post-hoc explanation tools that provide a degree of transparency even for complex models. This course explores both perspectives, and we'll discuss a wide range of tools that address different questions about how models make predictions. We'll cover many active research areas in the field, including feature attribution, counterfactual explanations, instance explanations, and human-AI collaboration.
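To make the feature attribution idea concrete, here is a minimal sketch of a post-hoc, Shapley-value-based explanation of a tree ensemble. It assumes the open-source shap and scikit-learn packages; the dataset and model are illustrative choices, not course requirements.

# A minimal sketch of post-hoc feature attribution using Shapley values.
# Assumes the open-source `shap` and `scikit-learn` packages; the dataset
# and model below are illustrative and not part of the course materials.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Fit a black-box model (a tree ensemble) on a small tabular dataset.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Attribute each prediction to the input features: every feature gets an
# additive contribution (its Shapley value) toward the model's output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:5])

# For each example, the attributions plus the expected value recover the
# model's prediction (the "local accuracy" property of Shapley values).
print(shap_values.shape)  # (5, 10): one attribution per example per feature
print(shap_values.sum(axis=1) + explainer.expected_value)  # approx. model.predict(X.iloc[:5])

The additive structure of these attributions is exactly what the feature importance and Shapley value lectures formalize and generalize.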
The course consists of 10 lectures (each session is 3 hours long) and is structured as follows:
Introduction and motivation (1 lecture)
Feature importance: removal-based explanations, propagation-based explanations, evaluation metrics (4 lectures)
Other explanation paradigms: inherently interpretable models, concept explanations, counterfactual explanations, instance explanations, neuron interpretation (3 lectures)
Human-AI collaboration (1 lecture)
Applications in industry (1 lecture)
Communication
Ed discussion board: https://edstem.org/us/courses/21664/discussion
Canvas: https://canvas.uw.edu/courses/1545385
Course email list: csep590b_sp22@cs.washington.edu
Textbooks
We won't use a textbook because no single book covers the course content (although Christoph Molnar's online book, Interpretable Machine Learning, is quite good). Instead, we'll reference recent research papers directly.
Machine learning resources
Andrew Ng’s lecture notes from Stanford CS 229
Kevin Jamieson’s course website for UW CSE 546
The Elements of Statistical Learning
Computer Age Statistical Inference
Grading
50% Homework
40% Paper discussion posts
Bonus: Leading discussion
10% In-class participation
Homework
There are three homework assignments (HW1-HW3), plus one warm-up review assignment (HW0):
HW0: Warm-up (30 points) PDF, LaTeX source
Due: Week 2, 11:59pm Monday April 4
HW1: Feature importance (100 points) PDF, LaTeX source
Due: Week 5, 11:59pm Monday April 25
HW2: Gradient-based explanations and metrics (100 points) PDF, LaTeX source
Due: Week 8, 11:59pm Monday May 16
HW3: Inherently interpretable models and instance explanations (100 points) PDF, LaTeX source
Due: Week 10, 11:59pm Wednesday June 1
Collaboration policy: Students must submit their own answers and their own code for programming problems. Limited collaboration is allowed, but you must indicate on the homework with whom you collaborated.
Late policy: Homeworks must be submitted online on Canvas by the posted due date. The penalty for late work is 20 points per day, and each student gets 3 free late days for the quarter.
Lecture schedule & reading
Lecture 1: Introduction
Lecture 2: Removal-based explanations
Lecture 3: Shapley values
Lecture 4: Propagation and gradient-based explanations
Slides: PDF
Lecture 5: Evaluating explanation methods
Slides: PDF
Reading: A Benchmark for Interpretability Methods in Deep Neural Networks
Lecture 6: Inherently interpretable models
Slides: PDF
Reading: Distilling Interpretable Models into Human-Readable Code
Lecture 7: Concept-based explanations, neuron interpretation
Reading: Concept Bottleneck Models, (Optional) Feature Visualization
Lecture 8: Counterfactual explanations, instance explanations
Lecture 9: Enhancing human-AI collaboration
Lecture 10: Model improvement, applications in industry and healthcare