Session of 16 October 2023

Session organized by Guillermo Durand and Thanh Mai Pham Ngoc

Venue: IHP, room 314 (Grisvard)


14.00: Marie Perrot-Dockès (Université Paris Cité)

Title: Selective inference for false discovery proportion in a hidden Markov model

Abstract: We address the multiple testing problem under the assumption that the true/false hypotheses are driven by a hidden Markov model (HMM), which has been recognized as a fundamental setting for modelling multiple testing under dependence since the seminal work of Sun and Cai (J R Stat Soc Ser B (Stat Methodol) 71:393–424, 2009). While previous work has concentrated on deriving specific procedures with a controlled false discovery rate under this model, we follow a recent trend in selective inference and consider the problem of establishing confidence bounds on the false discovery proportion for a user-selected set of hypotheses that can depend on the observed data in an arbitrary way. We develop a methodology to construct such confidence bounds, first when the HMM is known, then when its parameters, including the data distribution under the null and under the alternative, are unknown and estimated using a nonparametric approach. In the latter case, we propose a bootstrap-based methodology to take into account the effect of parameter estimation error. We show that taking advantage of the assumed HMM structure yields substantially sharper confidence bounds than existing agnostic (structure-free) methods, as shown by both numerical experiments and real-data examples.


15.00: Van Hà Hoang (University of Science, Ho Chi Minh City)

Title: Adaptive nonparametric estimation of a component density in a two-class mixture model

Abstract: We consider a two-class mixture model in which the density of one of the components is known, and we address the nonparametric adaptive estimation of the unknown probability density of the second component. We propose a randomly weighted kernel estimator with a fully data-driven bandwidth selection method, in the spirit of the Goldenshluger and Lepski method. An oracle-type inequality for the pointwise quadratic risk is derived, as well as convergence rates over Hölder smoothness classes. The theoretical results are illustrated by numerical simulations.

Joint work with Gaëlle Chagny, Antoine Channarond and Angelina Roche.


16.00: Etienne Roquain (Sorbonne Université)

Title: Conformal inference with adaptive scores

Abstract: Conformal inference is a fundamental tool of statistics providing distribution-free guarantees. We consider here the transductive setting, where decisions are made not only for one new data point but for $m$ such new points, which gives rise to a family of $m$ conformal $p$-values. While classical results only concern their marginal distribution, we provide theoretical insights into their joint distribution, including a DKW-type concentration inequality. In addition, these results hold for arbitrary exchangeable scores, including some adaptive ones that can use part of the test sample. Our approach is then applied to provide new uniform guarantees for two machine learning tasks: interval prediction with transfer learning training and novelty detection with classification training.

Joint work with Ulysse Gazin and Gilles Blanchard.
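
For readers unfamiliar with the objects in the last abstract, the following is a minimal sketch of how a family of $m$ marginal conformal $p$-values is usually computed from a calibration sample of size $n$. It assumes the standard definition $p_i = (1 + \#\{j \le n : S_j \ge S_{n+i}\})/(n+1)$ and the convention that larger scores mean more atypical points. The function name and the toy scores are illustrative assumptions, not taken from the talk, and the sketch does not implement the adaptive scores or the joint guarantees mentioned in the abstract.

```python
import numpy as np


def conformal_p_values(calib_scores, test_scores):
    """Marginal conformal p-values for m test points.

    Assumed convention (not from the talk): a larger score means a more
    atypical point, so small p-values flag unusual test points.
    """
    calib = np.sort(np.asarray(calib_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = calib.size
    # searchsorted(..., side="left") counts calibration scores strictly
    # below each test score, so n minus that count is the number >= it.
    n_greater_equal = n - np.searchsorted(calib, test, side="left")
    return (1.0 + n_greater_equal) / (n + 1.0)


# Toy example with n = 4 calibration scores and m = 3 test scores.
p_values = conformal_p_values([1.0, 2.0, 3.0, 4.0], [0.0, 2.5, 5.0])
print(p_values)  # the three p-values are 1.0, 0.6 and 0.2
```

All $m$ $p$-values share the same calibration sample, which is why they are dependent and why results on their joint distribution go beyond the classical marginal guarantee.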