Algorithmic Bias (AS.110.365)

Course Information

(Spring 2022)

Professor: Susama Agarwala

e-mail: susama.agarwala@jhuapl.edu

Office hours: Thursday 1:30-2:30. Virtual. Zoom meeting id: 160 1929 5670, Passcode: 624779


TA: Nandan Kulkarni

e-mail: nkulkar8@jhu.edu

Office hours: Tuesday 1-2. Room Krieger 204


Meeting times:

Lecture: MW 3-4:15 Hodson 311

Section: F 3-3:50 Hodson 305


Overly ambitious Course Syllabus

This is a superset of the material we will cover in class. As the course evolves, and we inevitably have to cut material, check back to see updated versions.

Bias course syllabus.pdf

Material covered

This section will be updated regularly with a few suggested links for the topics to be covered in class, as well as my notes on the material that I just covered. Click on the link associated with each lecture for my notes.

The supplementary readings are by no means the ONLY source of information. Students are encouraged to use other statistics texts and sources that they are more comfortable with in order to get a different take on the material.

January 24: Introduction to AI Bias

January 26: Gauss Markov Assumptions and Linear Regression

Supplementary readings: Econometric Analysis (Greene) Chapters 2, 3, 4

Gauss-Markov Theorem and Ordinary Least Square Assumptions

YouTube lecture on Gauss-Markov

Lecture notes

January 31: Gauss Markov Assumptions pt. 2 and omitting confounding variables

Omitting confounding variables

Lecture notes

February 2: Confounding Variables and Pearson's R

30 million word gap and here

Vaccination and political leaning picture taken from here

Lecture notes

February 4: Section

Slides

February 7: More Pearson's R; Regression without Gauss Markov

Mostly Harmless Econometrics (Chapter 3)

Lecture notes

February 9: Conditional Expectation Functions (Regression without Gauss Markov)

Mostly Harmless Econometrics (Chapter 3)

Lecture notes (note that proofs not done in class are available in these notes)

Feb 14: Review of CEF and regression interpretation concepts

Lecture notes

Feb 16: Logistic Regressions

Latent variables for binary response

Econometric Analysis (Greene) Chapter 21

Lecture notes

Feb 21 & 23:

Working with data scientifically

Hans Rosling:

1) Debunking third-world myths

2) The magic washing machine

3) How not to be ignorant about the world

4) Religions and babies

What not to do with data

Cathy O'Neil:

1) Era of blind faith in big data must end

2) Weapons of Math Destruction

Joy Buolamwini

1) Compassion through computation

Timnit Gebru

1) How to stop AI from marginalizing communities

Ruha Benjamin

1) From park bench to lab bench (discussing research design more generally)

Feb 28:

Lecture Notes

Why binary predictors are tricky.R

March 2: Confusion matrices and ROC curves

March 7: Brief introduction to machine learning

Lecture Notes

Perceptrons:

Biologically motivated

Towards data science

Neural Networks:

IBM

Towards data science

Slightly more in-depth article (with use cases)

Hands-on:

Tensorflow playground

March 9:

Lecture Notes

Shannon Entropy

KL Divergence

Cross Entropy I

Cross Entropy II

March 14: Review of homework 6

Code from class


March 16:

Recording from class

Lecture notes

Gradient Descent

Back Propagation


March 28: Different measures of fairness

Fairness in Criminal Justice Risk Assessments: The State of the Art

Lecture Notes


March 30: Fairness Impossibility proofs

ROC curves

Lecture notes


April 4: Efficiency/ Fairness Tradeoff

Equality of Opportunity in Supervised Learning

Lecture notes


April 6: Efficiency/ Fairness Tradeoff

Lecture notes

April 11: Discussion of Fairness and Tradeoffs

April 16: Discussion for HW 9

April 18: Algorithmic Redlining

Dummy variables

April 21: Algorithmic Redlining/ When tradeoffs aren't so bad

April 25: When tradeoffs aren't so bad

April 27: Final Project

The Use and Misuse of Counterfactuals in Ethical Machine Learning

Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings

“This Whole Thing Smacks of Gender”: Algorithmic Exclusion in Bioimpedance-based Body Composition Analysis

Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately

Algorithmic Fairness in Predicting Opioid Use Disorder Using Machine Learning

An Agent-based Model to Evaluate Interventions on Online Dating Platforms to Decrease Racial Homogamy

Homeworks

There will be 10 homework assignments this semester, posted on Fridays, and due the Monday 10 days following. Both assignments and solutions will be posted here.

Homework 1 (Due February 7)

homework1.csv

TA Solutions


Homework 2 (Due February 14)

Baseball data, source from here

hw1generation.R

TA Solutions


Homework 3 (Due February 21)

school_data.csv

TA Solutions


Homework 4 (Due February 28)


Homework 5 (Due March 7)

science experiment.csv

TA Solutions


Homework 6 (Due March 14)

full_school_data.csv

TA Solutions


Homework 7 (Due March 28)

TA Solutions


Homework 8 (Due April 4)

Paper on fairness

TA Solutions


Homework 9 (Due April 13)

Paper on tradeoffs

fake school data.R

TA Notes

TA Solutions


Homework 10 (Due April 22)

Algorithmic Redlining I

Algorithmic Redlining II


Final Project (Due April 27)

  • Pick 2 papers from FAccT 2021 and give a 20-minute presentation on the two combined.

  • Paper selection (Due April 15)

  • At least one paper must be on the topic of fairness

  • At least one paper must discuss a classifier or a regression method

  • List of accepted papers