Session of 20 October 2008

Monday, 20 October 2008

Organizers: Judith Rousseau and Jean-Yves Audibert

14h00 Francis Bach (INRIA - Ecole Normale Supérieure)

Model Consistent Lasso Estimation through the Bootstrap

Abstract: We consider the least-squares linear regression problem with regularization by the L1-norm, a problem usually referred to as the Lasso. We present a detailed asymptotic analysis of model consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific decay rate, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. This property implies that if we run the Lasso for several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection algorithm, referred to as the Bolasso, compares favorably with other linear regression methods on synthetic data and datasets from the UCI machine learning repository.
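The bootstrap-and-intersect idea described in this abstract can be sketched in a few lines. The code below is only a minimal illustration under stated assumptions, not the speaker's implementation: the Lasso is solved with a plain cyclic coordinate descent, bootstrap replications resample (x_i, y_i) pairs with replacement, and the function names (`soft_threshold`, `lasso_cd`, `bolasso_support`), the synthetic data, and the choice `lam = 1/sqrt(n)` are all hypothetical and chosen for illustration.

```python
import math
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of t * |.|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200, tol=1e-10):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw; equals y because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # all-zero column: its coefficient stays at zero
            # Correlation of column j with the residual that ignores w[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_wj
            max_delta = max(max_delta, abs(delta))
        if max_delta < tol:
            break  # no coordinate moved noticeably: treat as converged
    return w


def bolasso_support(X, y, lam, n_boot=32, seed=0):
    """Intersect the Lasso supports obtained on n_boot bootstrap replications."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    support = set(range(p))
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        w = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        support &= {j for j in range(p) if w[j] != 0.0}
    return sorted(support)


if __name__ == "__main__":
    rng = random.Random(1)
    n, p = 100, 6
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Synthetic data (an assumption): only the first two variables matter.
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 1.0) for row in X]
    lam = 1.0 / math.sqrt(n)  # illustrative choice, not taken from the abstract
    single = [j for j, wj in enumerate(lasso_cd(X, y, lam)) if wj != 0.0]
    print("single Lasso support:", single)
    print("Bolasso support:", bolasso_support(X, y, lam))
```

Intersecting supports is what removes irrelevant variables: each of them is selected only with some probability below one in a given replication, so it is unlikely to survive all replications, whereas the relevant variables are selected almost every time.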

15h00 Arnak Dalalyan (CERTIS, École des Ponts-ParisTech)

Dimension reduction in the regression model

Abstract: In a regression model, where the goal is to explain and predict a so-called response variable from a multidimensional explanatory variable, the performance of statistical procedures deteriorates rapidly as the dimension of the explanatory variable grows. One approach that mitigates this curse of dimensionality to some extent, without resorting to excessively restrictive assumptions, is to restrict attention to models with a fixed number of effective dimension-reduction directions. Estimating these directions from a sample is the statistical problem we address. The aim of this talk is to present a new procedure for estimating these directions, based on the idea of structural adaptation. We will compare our procedure with the main existing procedures, both through their theoretical properties and through simulations.
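For readers unfamiliar with this setting, models with a fixed number of directions are commonly written as a multi-index regression model. The formula below is an assumed formalization, since the abstract itself states no equation; the symbols g, \vartheta_k, m, d and \varepsilon are notation introduced here for illustration only.

```latex
Y = g\bigl(\vartheta_1^\top X, \dots, \vartheta_m^\top X\bigr) + \varepsilon,
\qquad \mathbb{E}[\varepsilon \mid X] = 0,
\qquad X \in \mathbb{R}^d, \quad m < d .
```

Here g is an unknown link function, and the statistical problem is to estimate the span of the directions \vartheta_1, \dots, \vartheta_m from a sample of (X, Y) pairs.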

16h00 Nicolas Chopin (ENSAE-ParisTech)

Bayesian model choice in hidden Markov models, change point models, and beyond

Abstract: In Chopin (2007), I proposed to rewrite hidden Markov models so as to label components by order of appearance. In this talk, I discuss how this rewriting allows for estimating the number of components that have appeared up to time t, and how this relates to Bayesian model choice. I'll discuss how to extend such ideas to other classes of models, such as change point models, semi-Markov models, and continuous-time Markov-switching models. Computational difficulties with Bayesian model choice will be briefly discussed, with some emphasis on particle filtering, which seems very promising in such a setting.
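The labelling convention mentioned in this abstract can be illustrated on a toy example. The sketch below assumes a fully known state path, which is not the case in a hidden Markov model where states are latent and must be inferred; it only shows what "labelling components by order of appearance" means and why the number of components seen up to time t is then easy to read off. The function name `relabel_by_appearance` is hypothetical.

```python
def relabel_by_appearance(states):
    """Relabel a state path so that labels are 1, 2, 3, ... in order of first appearance.

    Returns the relabelled path and, for each time t, the number of distinct
    components that have appeared up to and including t.
    """
    mapping = {}        # original label -> new label
    relabelled = []
    n_appeared = []
    for s in states:
        if s not in mapping:
            mapping[s] = len(mapping) + 1  # next unused label
        relabelled.append(mapping[s])
        n_appeared.append(len(mapping))
    return relabelled, n_appeared


if __name__ == "__main__":
    path = ["c", "c", "a", "c", "b", "a"]  # toy state path (an assumption)
    labels, counts = relabel_by_appearance(path)
    print(labels)  # [1, 1, 2, 1, 3, 2]
    print(counts)  # [1, 1, 2, 2, 3, 3]
```

Under this convention the number of components that have appeared up to time t is simply the largest label seen so far, which is what makes that quantity a natural target for sequential estimation.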