Location: IHP, Yvonne Choquet-Bruhat lecture hall (second floor of the Perrin building)
14.00: Pierre Humbert (CNRS and Université d'Evry)
Title: Transductive conformal inference for full ranking
Abstract: In this presentation, we will study a method based on conformal prediction for quantifying the uncertainty of ranking algorithms. We will focus on the following scenario: among $n + m$ objects to be ranked, the relative ranks of the first $n$ are known. Our goal is then to evaluate the ranking error on the remaining $m$ objects. To achieve this, we will use conformal prediction techniques. However, as we will show, existing methods are not suited to this particular framework. We will therefore propose a new approach, analyzing both its statistical properties and its empirical performance.
Paper: https://arxiv.org/pdf/2501.11384, with J.-B. Fermanian, P. Humbert, and G. Blanchard
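For readers unfamiliar with conformal prediction, the following is a minimal sketch of the generic split-conformal recipe adapted to ranks. It is only a point of reference for the setting described above, not the transductive method of the talk, and all variable names and modelling choices (noisy utility scores, absolute rank displacement as nonconformity score) are illustrative assumptions.

```python
# Hypothetical illustration: split-conformal prediction sets for the rank of new objects.
# This is NOT the transductive method of the talk, only the generic recipe it improves on.
import numpy as np

rng = np.random.default_rng(0)

# Latent "utilities": higher utility = better true rank. Scores are noisy observations of them.
n, m = 200, 20                                    # n objects with known ranks, m new objects
utility = rng.normal(size=n + m)
score = utility + 0.3 * rng.normal(size=n + m)    # what a ranking algorithm might output

# True and predicted ranks of the first n objects (rank 1 = largest utility among those n).
true_rank_cal = (-utility[:n]).argsort().argsort() + 1
pred_rank_cal = (-score[:n]).argsort().argsort() + 1

# Nonconformity score: absolute rank displacement on the calibration set.
resid = np.abs(pred_rank_cal - true_rank_cal)
alpha = 0.1
q = np.quantile(resid, np.ceil((1 - alpha) * (n + 1)) / n, method="higher")

# Prediction set for each new object: an interval of plausible ranks among the m new ones.
pred_rank_new = (-score[n:]).argsort().argsort() + 1
lower = np.clip(pred_rank_new - q, 1, m)
upper = np.clip(pred_rank_new + q, 1, m)
for j in range(3):
    print(f"new object {j}: plausible ranks [{lower[j]}, {upper[j]}]")
```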
15.00: Hugo Cui (CNRS and Université Paris-Saclay)
Title: High-dimensional analysis of a single-layer attention for sparse token classification
Abstract: When and how can an attention mechanism learn to selectively attend to informative tokens, thereby enabling detection of weak, rare, and sparsely located features? We address these questions theoretically in a sparse-token classification model in which positive samples embed a weak signal vector in a randomly chosen subset of tokens, whereas negative samples are pure noise. In the long-sequence limit, we show that a simple single-layer attention classifier can in principle achieve vanishing test error when the signal strength grows only logarithmically in the sequence length $L$, whereas linear classifiers require a $\sqrt{L}$ scaling. Moving from representational power to learnability, we study training from a finite number of samples, in a high-dimensional regime where sample size and embedding dimension grow proportionally. We prove that just two gradient updates suffice for the query weight vector of the attention classifier to acquire a nontrivial alignment with the hidden signal, inducing an attention map that selectively amplifies informative tokens. We further derive an exact asymptotic expression for the test error and training loss of the trained attention-based classifier, and quantify its capacity -- the largest dataset size that is typically perfectly separable -- thereby explaining the advantage of adaptive token selection over nonadaptive linear baselines. Joint work with Nicholas Barnfield and Yue M. Lu.
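To fix ideas, here is a small sketch of a sparse-token data model and a single-layer attention classifier of the kind described in the abstract. The exact data distribution, parameterization, and decision rule used in the talk are not given in the abstract, so everything below (the choice of signal strength, the oracle query vector, the threshold) is an illustrative assumption.

```python
# Hypothetical sketch of the sparse-token setting and a single-layer attention classifier.
# The precise model and parameterization of the talk are assumptions here.
import numpy as np

rng = np.random.default_rng(1)
L, d = 64, 32            # sequence length, embedding dimension
k = 4                    # number of informative tokens in a positive sample
beta = 2.0 * np.log(L)   # signal strength, logarithmic in L as in the abstract
u = rng.normal(size=d)
u /= np.linalg.norm(u)   # hidden signal direction

def sample(y):
    """One sequence of L token embeddings: pure noise, plus beta*u on k random tokens if y = +1."""
    X = rng.normal(size=(L, d))
    if y == 1:
        X[rng.choice(L, size=k, replace=False)] += beta * u
    return X

def attention_classifier(X, q, v, tau):
    """Single-layer attention read-out: softmax(X q) pools the tokens, then a thresholded linear head v."""
    logits = X @ q
    w = np.exp(logits - logits.max())
    w /= w.sum()                      # attention weights over the L tokens
    pooled = w @ X                    # attention-weighted average of token embeddings
    return 1 if pooled @ v > tau else -1

# With query and read-out aligned to the hidden signal, informative tokens dominate the pooling.
q = v = u
tau = beta / 2
labels = rng.choice([-1, 1], 500)
acc = np.mean([attention_classifier(sample(y), q, v, tau) == y for y in labels])
print(f"accuracy with oracle alignment: {acc:.2f}")
```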
16.00: Thomas Bonis (Université Gustave Eiffel)
Title: A non-optimal yet practical transport plan via diffusions
Abstract: Transporting the Gaussian measure to another target measure is at the core of many theoretical works, such as rates of convergence in the Central Limit Theorem, as well as applied ones, in particular modern generative algorithms. While the most natural transport plan to consider is the optimal one, it can be difficult to work with, as it depends globally on the target measure. In this talk, I will present another transport plan, defined from a diffusion process, which, while not optimal, is much more practical: it is entirely described by a set of functions called "score functions", which are purely local quantities. Finally, I will present recent work on estimating these score functions from a sample of the target measure.
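For concreteness, one standard construction of this kind (an assumption here, not necessarily the plan presented in the talk) runs an Ornstein-Uhlenbeck diffusion started from the target measure and transports the Gaussian back along the associated probability-flow ODE, whose drift involves only the score functions:

```latex
% Illustrative sketch (an assumption, not necessarily the talk's construction):
% an Ornstein--Uhlenbeck diffusion started from the target measure $\mu$, together
% with the probability-flow ODE sharing its marginals $p_t$.
\[
  dY_t = -Y_t\,dt + \sqrt{2}\,dB_t, \qquad Y_0 \sim \mu, \qquad Y_t \sim p_t,
\]
\[
  \dot{x}_t = -x_t - \nabla \log p_t(x_t).
\]
% Since $p_T \approx \mathcal{N}(0,\mathrm{Id})$ for large $T$, integrating the ODE
% backwards from a Gaussian point $x_T$ down to $t = 0$ yields a map $x_T \mapsto x_0$
% that transports (approximately) the Gaussian measure to $\mu$ and is entirely
% described by the purely local score functions $\nabla \log p_t$.
```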