[ ISIT 2022 Tutorial ]

Information-Theoretic Tools for Responsible Machine Learning

When: June 26 (Sunday) 9:30 AM - 12:30 PM (EEST, local Helsinki time)

This tutorial will present a survey of recent developments in fair and private machine learning (ML), describe open challenges in responsible data science, and serve as a call to action for the information theory community to engage with problems of social importance. The tutorial is divided into two parts.

First, we survey recent results at the intersection of differential privacy and information theory. Differential privacy has become the de facto standard for privacy, adopted in both government and industry. We will review the key definitions, metrics, and mechanisms used in differential privacy, along with recent developments in the field, and we aim to present these topics using mathematical tools familiar to information theorists. In particular, we will show that differential privacy metrics can be cast in terms of f-divergences and Rényi divergences, and we demonstrate how to apply the properties of these divergences to analyze differentially private algorithms deployed in ML applications (e.g., online learning and deep learning). We describe how the information-theoretic vantage point reveals fundamental operational limits of differentially private statistical analysis. We also discuss recent advances in differential privacy mechanism design for ML. The first half of the tutorial will conclude with a description of open problems in differential privacy that may benefit from information-theoretic techniques.
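As a brief illustration of the divergence view of differential privacy, the standard definitions can be written as follows. This is a sketch using textbook definitions, not material taken from the tutorial slides; the symbols M (mechanism), D ~ D' (neighboring datasets), E_γ (hockey-stick divergence, an f-divergence with f(t) = max(t - γ, 0)), and D_α (Rényi divergence of order α > 1) are notation assumed here.

```latex
% Hockey-stick divergence of order \gamma \ge 1
E_\gamma(P \,\|\, Q) = \sup_{A} \bigl[ P(A) - \gamma\, Q(A) \bigr]

% (\varepsilon, \delta)-differential privacy as a bound on an f-divergence
M \text{ is } (\varepsilon, \delta)\text{-DP}
  \iff \sup_{D \sim D'} E_{e^{\varepsilon}}\bigl( M(D) \,\|\, M(D') \bigr) \le \delta

% Renyi divergence of order \alpha > 1
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1}
  \log \mathbb{E}_{Q}\!\left[ \left( \frac{dP}{dQ} \right)^{\alpha} \right]

% Renyi differential privacy of order \alpha
M \text{ is } (\alpha, \varepsilon)\text{-RDP}
  \iff \sup_{D \sim D'} D_\alpha\bigl( M(D) \,\|\, M(D') \bigr) \le \varepsilon
```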

The second part of the tutorial will focus on fair ML. Our goal is to present recent developments in the field through an information-theoretic lens. We begin with an overview of metrics for evaluating fairness and discrimination, including individual fairness, group fairness, predictive multiplicity, and fair use. We formulate these metrics in a unified, easy-to-understand notation based on error rates and divergences. We then present recent results on the limits of fair classification. These limits include trade-offs involved in splitting classifiers across different demographic groups and are proved using standard converse results familiar to information theorists. We also review state-of-the-art fairness interventions, describing and contrasting several techniques developed over the past five years. Finally, we outline open problems in the field.
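To make the idea of group fairness metrics based on error rates concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the tutorial's own code: the function names `group_error_rates` and `equalized_odds_gap` are hypothetical, labels and predictions are assumed to be binary (0/1), and the gap shown is one common way to summarize differences in false positive and false negative rates across groups.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return per-group false positive and false negative rates.

    Assumes binary labels and predictions (0 or 1). Hypothetical helper
    written for illustration only.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for y, y_hat, g in zip(y_true, y_pred, groups):
        c = counts[g]
        if y == 1:
            c["pos"] += 1
            c["fn"] += int(y_hat == 0)  # missed positive
        else:
            c["neg"] += 1
            c["fp"] += int(y_hat == 1)  # false alarm
    rates = {}
    for g, c in counts.items():
        rates[g] = {
            # NaN when a group has no negatives / positives to evaluate on
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
            "fnr": c["fn"] / c["pos"] if c["pos"] else float("nan"),
        }
    return rates


def equalized_odds_gap(rates):
    """Largest across-group spread in FPR or FNR (0 means equal error rates)."""
    fprs = [r["fpr"] for r in rates.values()]
    fnrs = [r["fnr"] for r in rates.values()]
    return max(max(fprs) - min(fprs), max(fnrs) - min(fnrs))


if __name__ == "__main__":
    # Toy data: group "a" has one false negative and one false positive,
    # group "b" is classified perfectly.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    rates = group_error_rates(y_true, y_pred, groups)
    print(rates)
    print(equalized_odds_gap(rates))
```

A gap of zero corresponds to equal false positive and false negative rates across groups; larger values indicate a larger disparity in error rates.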

The tutorial will conclude with a hands-on demo of software packages for private and fair ML targeted toward student and post-doc attendees. No previous background in privacy or fair ML is required.

Slides & Notes

  • Part 1: Intro (Slides)

  • Part 2: Information-theoretic Tools for Differential Privacy (Slides)

  • Part 3: (Central) Differential Privacy in Machine Learning (Slides)

  • Part 4: Fairness in Machine Learning (Slides)

  • Part 5: Fairness Interventions (Slides, Notes)

  • Open Problems (Slides)

  • Hands-on Examples (Code)

List of Speakers

Flavio P. Calmon

(Harvard University)

Mario Diaz

(Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas)

Shahab Asoodeh

(McMaster University)

Haewon Jeong