Session of February 8, 2021

Session organized by Cristina Butucea and Pierre Latouche.

Session streamed via Zoom.


14.00 : Aymeric Dieuleveut (Ecole Polytechnique)

Title: Debiasing Stochastic Gradient Descent to handle missing values

Abstract: The stochastic gradient algorithm is a key ingredient of many machine learning methods and is particularly appropriate for large-scale learning. However, a major caveat of large data sets is their incompleteness. We propose an averaged stochastic gradient algorithm that handles missing values in linear models. This approach has the merit of requiring no modeling of the data distribution and of accounting for heterogeneous proportions of missing values. In both streaming and finite-sample settings, we prove that this algorithm achieves a convergence rate of O(1/n) at iteration n, the same as without missing values. We show the convergence behavior and the relevance of the algorithm not only on synthetic data but also on real data sets, including data collected from a medical register.


Joint work with Aude Sportisse, Claire Boyer, and Julie Josse.


15.00 : Guillem Rigaill (INRAE)

Title: Detecting Abrupt Changes in the Presence of Local Fluctuations and Autocorrelated Noise

Abstract: Whilst there is a plethora of algorithms for detecting changes in mean in univariate time series, almost all struggle in real applications where there is autocorrelated noise or where the mean fluctuates locally between the abrupt changes that one wishes to detect. In these cases, default implementations, which are often based on assumptions of a constant mean between changes and independent noise, can lead to substantial over-estimation of the number of changes. We propose a principled approach to detect such abrupt changes that models local fluctuations as a random walk process and autocorrelated noise via an AR(1) process. We then estimate the number and location of changepoints by minimising a penalised cost based on this model. We develop a novel and efficient dynamic programming algorithm, DeCAFS, that can solve this minimisation problem despite the additional challenge of dependence across segments, due to the autocorrelated noise, which makes existing algorithms inapplicable. We apply our method to measuring gene expression levels in bacteria.

Joint work with Gaetano Romano, Vincent Runge, and Paul Fearnhead.


16.00 : Sylvain Le Corff (Telecom Sud Paris)

Title: Deconvolution with unknown noise distribution

Abstract: In this talk, we consider the deconvolution problem in the case where the target signal is multidimensional and no information is available about the noise distribution. The deconvolution problem is solved based only on the corrupted signal observations. We establish the identifiability of the model up to translation when the signal has a Laplace transform with an exponential growth smaller than 2 and when it can be decomposed into two dependent components. We also propose an estimator of the probability density function of the signal without any assumption on the noise distribution. We discuss the rate of convergence of this estimator and present some practical applications.

Based on joint works with Elisabeth Gassiat and Luc Lehéricy.