Explainable AI - Schedule & Reading

Notes on Lecture Slides

You can access the lecture slides for this course on Canvas by navigating to the "Files" tab. These slides are currently only available to enrolled students, but we plan to release them to the public once the course has concluded.

Tentative schedule & reading

Week 1
- 3/31: Introduction
  - Reading: No reading
- 4/2: Background
  - Reading: Statistical Modeling: The Two Cultures
Week 2
- 4/7: Removal-based explanations 1
  - Reading: RISE: Randomized Input Sampling for Explanation of Black-box Models
- 4/9: Removal-based explanations 2
  - Reading: "Why Should I Trust You?": Explaining the Predictions of Any Classifier, (Optional) Feature Removal Is a Unifying Principle for Model Explanation Methods
Week 3
- 4/14: Shapley values 1
  - Reading: Understanding Global Feature Contributions with Additive Importance Measures
- 4/16: Shapley values 2
  - Reading: Shapley explainability on the data manifold
Week 4
- 4/21: Propagation and gradient-based explanations 1
  - Reading: SmoothGrad: Removing Noise by Adding Noise
- 4/23: Propagation and gradient-based explanations 2 + Representation explainability
  - Reading: Label-Free Explainability for Unsupervised Models
Week 5
- 4/28: Amortized optimization
  - Reading: A Benchmark for Interpretability Methods in Deep Neural Networks
- 4/30: Evaluating explanation methods
  - Reading: Evaluations and Methods for Explanation through Robustness Analysis
Week 6: Inherently interpretable models
- 5/5: Inherently interpretable models 1
  - Reading: Distilling Interpretable Models into Human-Readable Code
- 5/7: Inherently interpretable models 2
  - Reading: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
Week 7
- 5/12: Concept-based explanations
  - Reading: Concept Bottleneck Models, (Optional) Feature Visualization
- 5/14: Sparse autoencoders
  - Reading: Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Week 8
- 5/19: Instance explanations 1
  - Reading: Data Shapley: Equitable Valuation of Data for Machine Learning
- 5/21: Counterfactual explanations
  - Reading: Explanation by Progressive Exaggeration
Week 9
- 5/26: Hima Lakkaraju's Guest Lecture: Explainable AI for Real-World Decisions & Engineering Systems: Algorithmic Foundations and Practical Considerations
  - Readings: Explanation by Progressive Exaggeration, Who Gets Credit or Blame? Attributing Accountability in Modern AI Systems, Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten
- 5/28: LLM explainability 1
  - Reading: Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models
Week 10
- 6/2: LLM explainability 2
  - Reading: Questioning the AI: Informing Design Practices for Explainable AI User Experiences
- 6/4: XAI in practice II – Model improvement, applications to healthcare
  - Reading: Explainable Machine Learning Predictions for the Prevention of Hypoxaemia During Surgery