Privacy in Machine Learning


Introduction

Privacy-preserving machine learning aims to protect the privacy of individuals whose data is used to train a machine learning model, particularly when the trained model is released for public use. The gold-standard notion, differential privacy, provides a provable level of privacy for the trained model when a carefully calibrated amount of randomness (noise) is added to the training process. The course covers how this notion of privacy is employed in modern machine learning, in particular for supervised, semi-supervised, and unsupervised learning tasks. It also covers how this privacy notion trades off with other emerging notions in machine learning, such as interpretability, fairness, and causality.
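
As a first taste of what "carefully calibrated noise" means, here is a minimal sketch (not part of the official course materials) of the classic Laplace mechanism for releasing a single statistic with epsilon-differential privacy; the function name laplace_mechanism and the toy dataset are purely illustrative:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with epsilon-differential privacy by adding
    Laplace noise with scale = sensitivity / epsilon."""
    rng = np.random.default_rng() if rng is None else rng
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Illustrative example: privately release the mean of a dataset whose
# entries are known to lie in [0, 1].
data = np.array([0.2, 0.9, 0.4, 0.7])
sensitivity = 1.0 / len(data)  # replacing one record moves the mean by at most 1/n
private_mean = laplace_mechanism(data.mean(), sensitivity, epsilon=1.0)
print(private_mean)
```

A smaller epsilon (a stricter privacy guarantee) means a larger noise scale and hence a less accurate released statistic; this privacy-utility trade-off is a recurring theme in the course.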

Learning Outcomes

By the end of this course, you will have learned (or reinforced, if you already know them) the following skills and knowledge:

  • What differential privacy is

  • How differentially private mechanisms are employed in modern machine learning algorithms (a minimal sketch appears after this list)

  • How DP algorithms are employed in supervised, unsupervised, and semi-supervised learning settings

  • How DP algorithms are employed in Bayesian inference (posterior sampling and approximate Bayesian inference methods)

  • How model interpretability/explainability, fairness, and causality trade off with DP

  • Attack methods such as membership inference and data reconstruction attacks
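
To make the second bullet above more concrete, the following is a minimal NumPy sketch, under simplifying assumptions, of a single DP-SGD-style update: each per-example gradient is clipped to a fixed norm, and Gaussian noise calibrated to that norm is added to the averaged gradient. The function name dp_sgd_step and its parameters are illustrative, and a real implementation would also track the cumulative privacy loss across iterations (privacy accounting).

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng=None):
    """One simplified DP-SGD-style update: clip each per-example gradient to
    clip_norm, average, add Gaussian noise proportional to clip_norm, then
    take a gradient step."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    avg_grad = np.mean(clipped, axis=0)
    noise_std = clip_norm * noise_multiplier / len(per_example_grads)
    noisy_grad = avg_grad + rng.normal(0.0, noise_std, size=avg_grad.shape)
    return params - lr * noisy_grad

# Illustrative usage with random gradients for a 3-dimensional parameter vector.
rng = np.random.default_rng(0)
params = np.zeros(3)
grads = [rng.normal(size=3) for _ in range(8)]
params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1)
```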

Similar Courses Around The World

Similar courses are offered around the world, but they typically have a different focus from my course. Such courses include:

These course materials may be useful to look at as additional sources. My course covers similar material to the courses above at the beginning of the term, but diverges from them toward the middle and end. Please take a look at the Syllabus for more information.