Optimal transport and machine learning

Four introductory tutorials

Held virtually at IMSI

July 6-7, 2021


The tutorials are held virtually at IMSI. Each of the four lectures is about two hours long. The primary intended audience includes students and researchers in related fields; the only prerequisite is basic knowledge of probability theory and statistics.

These tutorials form part of a larger sequence and serve as an introduction to a research workshop on applied optimal transport that will take place in May 2022 within the Spring 2022 Long Program at IMSI. Funding by the National Science Foundation is gratefully acknowledged.

Organizer: Marcel Nutz (mnutz@columbia.edu)

Registration

Participation is free and open, but registration is required to obtain Zoom access. To register, please go to the program website, log in to (or create) your IMSI account as directed, and click Apply. A few days before the start date, you will receive an email with a Zoom link valid for the entire program.

Introduction to optimal transport

Marcel Nutz (Columbia)

July 6th, 10 a.m. (EDT)

This tutorial introduces the optimal transport problem and some of its fundamental mathematical results: existence of optimal transport plans, their geometric characterization, the dual problem, etc.
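For orientation, the Kantorovich formulation of the problem and its dual can be written as follows (standard notation; a sketch for the reader, not material taken from the tutorial slides). Given probability measures \(\mu\), \(\nu\) and a cost function \(c\), the primal problem minimizes the expected cost over all couplings, while the dual maximizes over pairs of potentials lying below the cost:

\[
\inf_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, \mathrm{d}\pi(x,y),
\qquad
\Pi(\mu,\nu) = \{\pi : \pi \text{ has marginals } \mu \text{ and } \nu\},
\]
\[
\sup \Big\{ \int f\, \mathrm{d}\mu + \int g\, \mathrm{d}\nu \;:\; f(x) + g(y) \le c(x,y) \Big\}.
\]

Under mild conditions on the cost (e.g., lower semicontinuity), the two values coincide; this Kantorovich duality underlies the dual problem mentioned above.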

Distribution-free nonparametric inference using optimal transport

Bodhisattva Sen (Columbia)

July 6th, 2 p.m. (EDT)

Nonparametric statistics, a subfield of statistics that came into being with the introduction of Wilcoxon's tests (Wilcoxon, 1945), is traditionally associated with distribution-free methods, e.g., hypothesis testing problems where the null distribution of the test statistic does not depend on the unknown data-generating distribution. Although enormous progress has been made in this field over the last 100 years, most distribution-free methods have been restricted to one-dimensional problems.

Recently, using the theory of optimal transport, many distribution-free procedures have been extended to multiple dimensions. Prominent examples include: (a) two-sample testing for equality of distributions, (b) testing for mutual independence, etc. In this lecture, I will summarize these recent developments and: (i) provide a general framework for distribution-free nonparametric testing in multiple dimensions, (ii) propose multivariate analogues of some classical methods (including Wilcoxon's tests), and (iii) study the (asymptotic) efficiency of the proposed distribution-free methods. I will also compare and contrast these distribution-free methods with the kernel-based methods that are very popular in the machine learning literature. In summary, I will illustrate that these distribution-free methods are as powerful and efficient as their traditional counterparts, while being more robust to heavy tails, outliers, and contamination.
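As a concrete illustration of the idea behind optimal-transport-based multivariate ranks, here is a minimal sketch (illustrative code, not the speaker's implementation): each sample point is matched to a point of a fixed reference set in the unit cube by solving an optimal assignment problem, and the matched reference points serve as multivariate ranks. The Halton-sequence reference set and the squared Euclidean cost are assumptions made for the example.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.stats import qmc  # low-discrepancy reference grid (illustrative choice)

def ot_ranks(X, seed=0):
    """Empirical optimal-transport ranks of an (n x d) sample X.

    Each data point is matched to a point of a fixed reference set in [0,1]^d
    (here a Halton sequence) by solving the optimal assignment problem with
    squared Euclidean cost; the matched reference points play the role of
    multivariate ranks.
    """
    n, d = X.shape
    reference = qmc.Halton(d=d, seed=seed).random(n)               # n points in [0,1]^d
    cost = ((X[:, None, :] - reference[None, :, :]) ** 2).sum(-1)  # n x n cost matrix
    rows, cols = linear_sum_assignment(cost)                       # optimal matching
    ranks = np.empty_like(reference)
    ranks[rows] = reference[cols]                                  # rank of X[i] = its matched grid point
    return ranks

# Example: ranks of a two-dimensional Gaussian sample; each row of R lies in [0,1]^2.
X = np.random.default_rng(1).normal(size=(200, 2))
R = ot_ranks(X)

For continuous data distributions, the empirical ranks form a uniformly random permutation of the reference set regardless of the underlying distribution, which is what makes tests built from such ranks distribution-free.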

Learning with optimal transport

Aude Genevay (MIT)

July 7th, 10 a.m. (EDT)

In this tutorial, we will start by introducing different notions of distance between probability measures and see how they compare on toy learning problems, from both a theoretical and a practical perspective. We will then focus on regularized optimal transport and present fast algorithms to compute it, along with theoretical guarantees. We will finish with an overview of learning problems that can be tackled using optimal transport.
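The workhorse algorithm for entropy-regularized optimal transport is the Sinkhorn iteration. The following minimal NumPy sketch (illustrative; the regularization strength eps, the iteration count, and the example data are assumptions, not choices from the tutorial) alternately rescales the rows and columns of the Gibbs kernel so that the resulting coupling has the prescribed marginals.

import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=500):
    """Entropy-regularized OT between discrete measures a and b with cost matrix C.

    Alternately rescales rows and columns of the Gibbs kernel K = exp(-C/eps)
    until the coupling u[:, None] * K * v[None, :] has marginals a and b.
    Returns the coupling P and the transport cost <P, C>.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # match column marginals
        u = a / (K @ v)                  # match row marginals
    P = u[:, None] * K * v[None, :]      # primal coupling
    return P, float((P * C).sum())

# Example: OT between two random point clouds with uniform weights.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(50, 2)), rng.normal(loc=1.0, size=(60, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # squared Euclidean cost
a, b = np.full(50, 1 / 50), np.full(60, 1 / 60)
P, cost = sinkhorn(a, b, C)

Each iteration costs only matrix-vector products, which is what makes the regularized problem attractive for large-scale learning; for very small eps the plain iteration above can underflow, and log-domain implementations are used in practice.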

Statistical estimation and optimal transport

Jonathan Niles-Weed (NYU)

July 7th, 2 p.m. (EDT)

In this tutorial, we will consider a fundamental statistical question of optimal transport: how well can optimal transport distances and maps be estimated from data? We will discuss the pervasive "curse of dimensionality" in the statistical theory of optimal transport and examine assumptions, such as smoothness, sparsity, and low dimensionality, which can partially ameliorate this curse. We will also explore minimax lower bounds, which establish that in the absence of such assumptions, it is not possible to avoid the curse of dimensionality entirely. These pessimistic results help to motivate several variants of optimal transport which have recently become popular in machine learning.
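To get a feel for the curse of dimensionality, one can compare two independent empirical samples from the same distribution: with uniform weights and equal sample sizes, the 2-Wasserstein distance reduces to an optimal assignment. The sketch below (illustrative only; the uniform distribution, sample size, and dimensions are arbitrary choices) shows the distance shrinking much more slowly as the dimension grows, in line with the slow convergence rates of order n^{-1/d} behind the curse of dimensionality.

import numpy as np
from scipy.optimize import linear_sum_assignment

def empirical_w2(x, y):
    """2-Wasserstein distance between two empirical measures with n points each.

    With uniform weights and equal sample sizes, the optimal coupling is a
    permutation, so the problem is an optimal assignment with squared cost.
    """
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)
    return float(np.sqrt(cost[rows, cols].mean()))

# Distance between two samples of size n from the same distribution, by dimension.
rng = np.random.default_rng(0)
n = 500
for d in (1, 2, 5, 10):
    x, y = rng.uniform(size=(n, d)), rng.uniform(size=(n, d))
    print(d, empirical_w2(x, y))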