Session of March 17, 2024
Session organized by Alain Célisse and Laure Sansonnet
Location: IHP, Yvonne Choquet-Bruhat lecture hall (second floor of the Perrin building)
14:00: Hugo Chardon (CREST - ENSAE)
Title: Finite-sample performance of the maximum likelihood estimator in logistic regression: density estimation and classification
Abstract: Logistic regression is a classical model for describing the probabilistic dependence of binary responses on multivariate covariates. We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression, assessed in terms of logistic risk. We address two questions: first, the existence of the MLE (which holds precisely when the dataset is not linearly separable), and second, its accuracy when it exists. These properties depend both on the dimension of the covariates and on the signal strength. In the case of Gaussian covariates and a well-specified logistic model, we obtain sharp non-asymptotic guarantees for the existence and the excess logistic risk of the MLE. We then generalize these results in two directions: first, to non-Gaussian covariates satisfying a certain two-dimensional margin condition, and second, to the general setting of statistical learning with a possibly misspecified logistic model. We will also present recent results on the performance of the MLE as a binary classifier when the logistic loss is used as a convex surrogate for the binary loss in supervised classification. We obtain fast rates that should be compared with those derived under the so-called margin assumption of Mammen and Tsybakov in the supervised classification literature. This is achieved by refining the analysis based on Zhang's lemma, which in our setting only yields a slow rate.
This talk is based on joint works with Matthieu Lerasle and Jaouad Mourtada.
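To make the separability criterion concrete, here is a minimal numerical sketch (not taken from the talk): it simulates Gaussian covariates under a well-specified logistic model, tests linear separability with a linear program, and fits the MLE by minimizing the empirical logistic risk when it exists. The sample size n, dimension d and signal strength below are arbitrary illustrative choices.

```python
# Minimal sketch: the logistic MLE exists iff the data are not linearly
# separable. Gaussian covariates, well-specified logistic model; the
# sizes n, d and the signal strength are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog, minimize

rng = np.random.default_rng(0)
n, d, signal = 200, 5, 2.0
beta_star = signal * np.ones(d) / np.sqrt(d)
X = rng.standard_normal((n, d))            # Gaussian covariates
y = np.where(rng.random(n) < 1 / (1 + np.exp(-X @ beta_star)), 1.0, -1.0)

def separable(X, y):
    # Data are linearly separable iff some b satisfies y_i <x_i, b> >= 1,
    # a linear feasibility problem solved here with a zero objective.
    A = -(y[:, None] * X)                  # encodes y_i <x_i, b> >= 1
    res = linprog(np.zeros(d), A_ub=A, b_ub=-np.ones(n),
                  bounds=[(None, None)] * d)
    return res.success

def logistic_risk(b):
    # Empirical logistic risk: mean of log(1 + exp(-y_i <x_i, b>)).
    return np.mean(np.logaddexp(0.0, -y * (X @ b)))

if separable(X, y):
    print("Separable sample: the MLE does not exist.")
else:
    mle = minimize(logistic_risk, np.zeros(d), method="BFGS").x
    print("Empirical logistic risk of the MLE:", logistic_risk(mle))
```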
15:00: Nina Vesseron (CREST - ENSAE)
Title: Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
Abstract: A common approach to generative modeling is to split model-fitting into two blocks: first define how to sample noise (e.g., Gaussian), and then choose what to do with it (e.g., using a single map or flows). In this work, we explore an alternative route that ties sampling and mapping together. We find inspiration in moment measures, a result stating that for any measure $\rho$ supported on a compact convex set of $\mathbb{R}^d$, there exists a unique convex potential $u$ such that $\rho=\nabla u\,\sharp\,e^{-u}$. While this does seem to tie sampling (from the log-concave distribution $e^{-u}$) effectively to action (pushing particles through $\nabla u$), we observe on simple examples (e.g., Gaussians or 1D distributions) that this choice is ill-suited for practical tasks. We study an alternative factorization, where $\rho$ is written as $\nabla w^*\,\sharp\,e^{-w}$, with $w^*$ the convex conjugate of $w$. We call this approach conjugate moment measures, and obtain far more intuitive results on these same examples. Because $\nabla w^*$ is the Monge map between the log-concave distribution $e^{-w}$ and $\rho$, we rely on optimal transport solvers to propose an algorithm that recovers $w$ from samples of $\rho$, parameterizing $w$ as an input-convex neural network.
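As a toy 1D illustration of the factorization $\rho=\nabla w^*\,\sharp\,e^{-w}$ (and not the paper's OT-based algorithm), one can pick a potential where everything is explicit. With the hypothetical choice $w(y)=y^4/4$, the map $\nabla w^*$ is the inverse of $w'$, i.e. a signed cube root; sampling from the log-concave density proportional to $e^{-w}$ by rejection and pushing the samples through $\nabla w^*$ generates $\rho$.

```python
# Toy 1D sketch of rho = grad(w*) # e^{-w} for the assumed potential
# w(y) = y^4/4, where grad(w*) = (w')^{-1} is a signed cube root.
import numpy as np

rng = np.random.default_rng(0)

def w(y):
    return y**4 / 4.0

def grad_w_conjugate(x):
    # In 1D, grad(w*) inverts w': here (w')^{-1}(x) = sign(x)|x|^{1/3}.
    return np.sign(x) * np.abs(x) ** (1.0 / 3.0)

def sample_exp_minus_w(n):
    # Rejection sampling from the density proportional to e^{-w}, with a
    # standard Gaussian proposal: e^{-y^4/4} <= M e^{-y^2/2} holds with
    # M = e^{1/4} (the maximum of y^2/2 - y^4/4 is 1/4, at y^2 = 1).
    out, M = [], np.exp(0.25)
    while len(out) < n:
        y = rng.standard_normal(n)
        u = rng.random(n)
        accept = u < np.exp(-w(y)) / (M * np.exp(-y**2 / 2.0))
        out.extend(y[accept].tolist())
    return np.array(out[:n])

noise = sample_exp_minus_w(10_000)       # samples from e^{-w} (normalized)
rho_samples = grad_w_conjugate(noise)    # pushforward through grad(w*)
print("mean/std of the generated rho:", rho_samples.mean(), rho_samples.std())
```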
16:00: Erwan Scornet (LPSM - SU)
Title: Going beyond the fear of emptiness to gain (rates of) consistency
Abstract: Missing data are ubiquitous in many real-world datasets, as they naturally arise when gathering information from various sources in different formats. Most statistical analyses have focused on parameter estimation despite missing values. However, accurate estimation is not sufficient for making predictions on a test set that itself contains missing data: a way of handling missing entries must be designed. In this talk, we will analyze two different approaches to prediction in the presence of missing data: pattern-by-pattern and zero-imputation strategies. We will establish upper bounds on the excess risk of these methods in the context of linear models. These results are part of the PhD thesis of Alexis Ayme, co-supervised with Claire Boyer and Aymeric Dieuleveut.
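For concreteness, here is a minimal sketch of the zero-imputation strategy in a linear model (illustrative only, not the analysis of the papers listed below): covariates are masked completely at random, missing entries are imputed by 0, and a least-squares predictor is fit on the imputed design. The dimensions, noise level and missingness rate are arbitrary assumptions.

```python
# Minimal sketch: naive zero-imputation for prediction with missing
# covariates in a linear model, under MCAR masking. All constants below
# (n, d, p_missing, sigma) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, p_missing, sigma = 1000, 20, 0.3, 0.5
beta_star = rng.standard_normal(d) / np.sqrt(d)

def simulate(n):
    X = rng.standard_normal((n, d))
    y = X @ beta_star + sigma * rng.standard_normal(n)
    mask = rng.random((n, d)) < p_missing   # True = missing (MCAR)
    return X, y, mask

X, y, mask = simulate(n)
X_imp = np.where(mask, 0.0, X)              # naive zero-imputation
beta_hat = np.linalg.lstsq(X_imp, y, rcond=None)[0]

# Test risk when predicting from zero-imputed test covariates.
X_te, y_te, mask_te = simulate(5000)
pred = np.where(mask_te, 0.0, X_te) @ beta_hat
print("test MSE with zero-imputation:", np.mean((pred - y_te) ** 2))
print("noise level sigma^2          :", sigma**2)
```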
Related papers:
- Near-optimal rate of consistency for linear models with missing values https://proceedings.mlr.press/v162/ayme22a/ayme22a.pdf
- Naive imputation implicitly regularizes high-dimensional linear models https://arxiv.org/abs/2301.13585
- Random features models: a way to study the success of naive imputation https://arxiv.org/abs/2402.03839