Séance du 28 septembre 2009

Lundi 28 septembre 2009

Organisateurs: Stéphane Gaïffas et Erwan Le Pennec

14h00 Gérard Biau ( LSTA-LPMA / UPMC)

Statistical Analysis of k-Nearest Neighbor Collaborative Recommendation

(joint work with Benoît Cadre (ENS Cachan, Antenne de Rennes) and Laurent Rouvière (ENSAI - Université Rennes 2))

Abstract: Collaborative recommendation is an information-filtering technique that attempts to present information items (movies, music, books, news, images, Web pages, etc.) that are likely of interest to the Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists allowing us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation and analyze its asymptotic performance as the number of users grows. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering. We establish consistency of the procedure under mild assumptions on the model. Rates of convergence and examples are also provided.

15h00 Francis Bach (INRIA - Ecole Normale Supérieure)

Méthodes parcimonieuses pour la factorisation de matrices

(en collaboration avec J. Abernethy, T. Evgeniou, R. Jenatton, J. Mairal, G. Obozinski, J. Ponce, G. Sapiro, J.-P. Vert)

Résumé: Les problèmes de factorisation de matrices se posent dans de nombreux domaines, comme le traitement d'images, le traitement de la parole ou la bioinformatique. Dans cet exposé, je présenterai deux contributions récentes qui étendent les méthodes parcimonieuses habituellement dédiées à la régression. Pour cela, deux types de parcimonie différents sur les matrices seront décrits: le premier type correspond aux décompositions de rang faible alors que le deuxième type correspond aux décompositions dont seulement l'une des deux composantes a beaucoup de composantes nulles (problème souvent appelé analyse en composantes principales parcimonieuses). Je présenterai des applications de ces méthodes dans les systèmes de recommandations et le traitement de l'image.

16h00 Jalal Fadili (GREYC / ENSICAEN)

Optimisation problems in compressed sensing: application to sparse recovery with quantization noise

Abstract: This talk would be divided in two parts. The first one will focus on several optimization problems involved in linear inverse problems where the solution is assumed to be sparsely represented in an overcomplete dictionary of waveforms. These problems are formalized within a unified framework of convex optimization theory, and invoke tools from convex analysis (e.g. duality, proximity operators) and maximal monotone operator splitting. Fast iterative splitting algorithms are proposed to solve them. This framework includes some previously proposed algorithms as a special case. In the second part, we study the problem of recovering sparse or compressible signals from uniformly quantized measurements. Toward this goal, a new class of convex optimization decoders is introduced, coined Basis Pursuit DeQuantizer of moment $p$ (BPDQ$_p$), that model the quantization distortion more faithfully than the commonly used BPDN or Lasso. We show that in oversampled situations, the performance of the BPDQ$_p$ decoders are significantly better than that of BPDN, with reconstruction error due to quantization divided by $\sqrt{p+1}$. This reduction relies on a modified Restricted Isometry Property of the sensing matrix expressed in the $\ell_p$-norm (RIP$_p$); a property satisfied by Gaussian random matrices with high probability. The challenging BPDQ$_p$ optimization program are solved by monotone operator splitting methods presented in the first part.