Our meetings are held online and broadcast via Zoom. Upcoming meetings will be announced in due time.
Link: https://jku.zoom.us/j/93472470104?pwd=F1hcLbpbCt0tsAs9HPg7tfTGtd6RG4.1
Meeting ID: 934 7247 0104
Password: 651434
Tuesday, July 8th, 2025, 14:00-15:00 CEST
Collaborative Likelihood-Ratio Estimation over Graphs
Alejandro de la Concha (Centre Borelli, ENS Paris-Saclay, Université Paris-Saclay, Gif-sur-Yvette, France)
Density ratio estimation is an elegant approach for comparing two probability measures P and Q, relying solely on i.i.d. observations from these distributions and making minimal assumptions about P and Q. In the first part of the talk, we introduce a graph-based extension of this problem, where each node of a fixed graph is associated with two unknown node-specific probability measures, P_v and Q_v, from which we observe samples. Our goal is to estimate, for each node, the density ratio between the corresponding densities while leveraging the information provided by the graph structure. We develop this idea through a concrete non-parametric method called GRULSIF.
A key feature of collaborative likelihood-ratio estimation is that it enables a straightforward derivation of test statistics to quantify differences between the node-level distributions P_v and Q_v. In the second part of the talk, we present a non-parametric, graph-structured multiple hypothesis testing framework named collaborative non-parametric two-sample testing, which has potential applications in spatial statistics and neuroscience.
Based on:
1. de la Concha, A., Vayatis, N., & Kalogeratos, A. (2024). Collaborative likelihood-ratio estimation over graphs [arXiv preprint arXiv:2205.14461]. arXiv. https://arxiv.org/abs/2205.14461
2. de la Concha Duarte, A. D., Vayatis, N., & Kalogeratos, A. (2025). Collaborative non-parametric two-sample testing. Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS) (Vol. 258, pp. 838–846). PMLR. https://proceedings.mlr.press/v258/concha-duarte25a.html
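To fix ideas, here is a minimal numerical sketch of the plain, single-node least-squares likelihood-ratio estimator that collaborative estimation builds upon; the Gaussian-kernel model, the closed-form ridge solution, and all names below are illustrative assumptions, not the authors' GRULSIF code. In the graph setting, one such problem is solved per node, with an additional graph penalty coupling the coefficient vectors of neighbouring nodes.

```python
import numpy as np

def gaussian_kernel(X, C, sigma):
    # Pairwise Gaussian kernel values K(x_i, c_k) = exp(-||x_i - c_k||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lsif_fit(X_p, X_q, centers, sigma=1.0, lam=1e-2):
    """Least-squares estimate of the ratio r = dP/dQ in the span of kernel centers."""
    Phi_q = gaussian_kernel(X_q, centers, sigma)          # (n_q, K)
    Phi_p = gaussian_kernel(X_p, centers, sigma)          # (n_p, K)
    H = Phi_q.T @ Phi_q / X_q.shape[0]                    # empirical E_Q[phi phi^T]
    h = Phi_p.mean(axis=0)                                # empirical E_P[phi]
    alpha = np.linalg.solve(H + lam * np.eye(H.shape[0]), h)
    return lambda X: gaussian_kernel(X, centers, sigma) @ alpha

# Toy usage: P = N(0.5, 1), Q = N(0, 1) on the real line.
rng = np.random.default_rng(0)
X_p = rng.normal(0.5, 1.0, size=(500, 1))
X_q = rng.normal(0.0, 1.0, size=(500, 1))
centers = X_q[:50]                                        # kernel centers taken from the Q sample
r_hat = lsif_fit(X_p, X_q, centers)
print(r_hat(np.array([[0.0], [1.0]])))                    # ratio below 1 near 0, above 1 near 1
```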
Former Talks
Tuesday, May 20th, 2025
Theory of data-driven parameter choice rules and acceleration methods for the regularization of inverse problems
Stefan Kindermann (JKU Linz, Austria)
In the first part of the talk, we outline some theoretical background on data-driven (or heuristic) rules for selecting the regularization parameter in regularization schemes for (linear) inverse problems. We discuss the famous negative result of Bakushinskii and contrast it with the more recent convergence theory in the restricted-noise case based on noise conditions.
In the second (independent) part, we investigate the aggregation method, an acceleration method for Tikhonov regularization. This and related methods can be understood within the framework of rational Krylov methods, which allows one to investigate their convergence properties and the conditions under which they act as regularization methods.
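As a concrete example of a data-driven rule, the sketch below implements the classical quasi-optimality criterion for Tikhonov regularization on a synthetic ill-posed system: the parameter is chosen without any knowledge of the noise level, by minimizing the change between reconstructions at consecutive points of a geometric grid. The toy problem and all names are illustrative assumptions; the talk treats such rules in far greater generality.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy ill-posed problem with rapidly decaying singular values.
n = 100
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 1.0 / (np.arange(1, n + 1) ** 2)
A = U @ np.diag(s) @ V.T
x_true = V @ (1.0 / np.arange(1, n + 1))                  # a mildly smooth exact solution
y = A @ x_true + 1e-4 * rng.standard_normal(n)            # noisy data; the noise level is not used below

def tikhonov(A, y, lam):
    # x_lam = (A^T A + lam I)^{-1} A^T y
    m = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ y)

# Quasi-optimality rule: on a geometric grid lam_k = lam_0 * q^k,
# pick the lambda minimizing ||x_{lam_{k+1}} - x_{lam_k}||.
lams = 1e-12 * (2.0 ** np.arange(40))
xs = [tikhonov(A, y, lam) for lam in lams]
diffs = [np.linalg.norm(xs[k + 1] - xs[k]) for k in range(len(lams) - 1)]
k_star = int(np.argmin(diffs))
print("heuristic lambda:", lams[k_star])
print("relative error:", np.linalg.norm(xs[k_star] - x_true) / np.linalg.norm(x_true))
```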
Tuesday, January 14th, 2025
Learning sparsity-promoting regularizers for inverse problems
Tapio Helin (LUT, Finland)
In this talk, I discuss a bilevel optimization framework for learning sparsity-promoting regularizers in linear inverse problems. I will outline some theoretical guarantees and demonstrate the framework’s flexibility through a couple of examples.
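Schematically, and with illustrative notation not taken from the talk, the bilevel problem has the following form: the upper level fits the reconstruction map to training pairs (y_i, x_i^†), while the lower level is a variational reconstruction whose sparsity-promoting regularizer depends on the learned parameters θ.

```latex
\begin{aligned}
\min_{\theta}\quad & \frac{1}{m}\sum_{i=1}^{m}\bigl\|\hat{x}_{\theta}(y_i)-x_i^{\dagger}\bigr\|^{2}\\
\text{s.t.}\quad & \hat{x}_{\theta}(y)\in\arg\min_{x}\ \tfrac12\|Ax-y\|^{2}
  +\sum_{j}\theta_{j}\,\bigl|\langle x,\psi_{j}\rangle\bigr| ,
\end{aligned}
```

Here the weighted l1 penalty over a dictionary {ψ_j} is just one possible sparsity-promoting parametrization.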
Tuesday, October 29th, 2024
Optimal rates for regularized learning in infinite dimensions
Mattes Mollenhauer (Merantix Momentum)
We give an overview of recent progress in the analysis of spectral regularization algorithms for regression problems with infinite-dimensional response variables. We impose source conditions by generalizing the interpolation space norms (Steinwart et al. 2009, Steinwart and Fischer 2020) to the vector-valued setting via a tensor product trick, which allows us to prove convergence also in misspecified model settings. For typical vector-valued kernels, we show that the resulting interpolation spaces coincide with Bochner-Sobolev spaces. Our results provide rates for a variety of applications such as the conditional mean embedding, kernel functional regression, and operator learning.
Based on:
Li, Z., Meunier, D., Mollenhauer, M., & Gretton, A. (2022). Optimal rates for regularized conditional mean embedding learning. Advances in Neural Information Processing Systems, 35, 4433–4445.
Li, Z., Meunier, D., Mollenhauer, M., & Gretton, A. (2024). Towards optimal Sobolev norm rates for the vector-valued regularized least-squares algorithm. Journal of Machine Learning Research, 25(181), 1–51.
Meunier, D., Shen, Z., Mollenhauer, M., Gretton, A., & Li, Z. (2024). Optimal rates for vector-valued spectral regularization learning algorithms. Advances in Neural Information Processing Systems (to appear).
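For orientation, the regularized least-squares (ridge) estimator analyzed in these works can be written, in standard notation that is a simplification of the papers' setting, as

```latex
\hat{f}_{\lambda}(x)=\sum_{i=1}^{n}\beta_{i}(x)\,y_{i},
\qquad
\beta(x)=\bigl(K+n\lambda I\bigr)^{-1}k_{x},
\quad K_{ij}=k(x_i,x_j),\ \ (k_x)_i=k(x,x_i),
```

with responses y_i taking values in a possibly infinite-dimensional Hilbert space; taking y_i to be the output-kernel feature map recovers the conditional mean embedding estimator. The cited papers establish learning rates of the familiar form n^{-β/(β+p)} under source and eigenvalue-decay conditions, up to the precise norms and parameter ranges detailed there, including the misspecified regime.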
Tuesday, July 9th, 2024
Convergence of Randomized Kaczmarz Algorithms in Hilbert Spaces
Xin Guo (Uni Queensland, Australia)
The Kaczmarz algorithm was first introduced in 1937 to solve large systems of linear equations. Existing works on the convergence analysis of the randomized Kaczmarz algorithm typically provide exponential rates of convergence, with the base tending to one as the condition number of the system increases. Results of this kind do not work well for large systems of linear equations, and do not apply to the online algorithms on Hilbert spaces for machine learning. In this talk, we provide a condition number-free analysis, which leads to polynomial rates of weak convergence for the randomized Kaczmarz algorithm. We also show the applications to kernel-based machine learning.
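For background, the sketch below implements the classical finite-dimensional randomized Kaczmarz iteration with the Strohmer-Vershynin row-sampling scheme; the condition-number-free, Hilbert-space analysis of the talk is of course not captured by this toy code, and all names are illustrative.

```python
import numpy as np

def randomized_kaczmarz(A, b, n_iter=10_000, seed=0):
    """Randomized Kaczmarz: project the iterate onto one randomly chosen equation per step.

    Rows are sampled with probability proportional to ||a_i||^2 (Strohmer-Vershynin scheme).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = (A ** 2).sum(axis=1)
    probs = row_norms2 / row_norms2.sum()
    x = np.zeros(n)
    for _ in range(n_iter):
        i = rng.choice(m, p=probs)
        # Orthogonal projection of x onto the hyperplane {z : a_i^T z = b_i}.
        x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]
    return x

# Toy consistent system.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 50))
x_true = rng.standard_normal(50)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```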
Tuesday, June 18, 2024
Learning regularizers - bilevel optimization or unrolling?
Dirk Lorenz (Uni Bremen, Germany)
In this talk we will consider the problem of learning a convex regularizer from a theoretical perspective. In general, learning of variational methods can be done by bilevel optimization, where the variational problem is the lower-level problem and the upper-level problem minimizes over some parameter of the lower-level problem. However, this is usually too difficult in practice, and a popular alternative is so-called unrolling (or unfolding) of a solver for the lower-level problem: one replaces the lower-level problem by an algorithm that converges to a solution of that problem, chooses a number N of iterations to be performed, and uses the N-th iterate as a substitute for the true solution. While this approach is often successful in practice, few theoretical results are available. In this talk we will consider a situation in which a thorough comparison of the bilevel approach and the unrolling approach is possible, in the particular case of a quite simple toy example. Even though the example is simple, the situation is already complex and reveals a few phenomena that have been observed in practice: deeper unrolling is often not beneficial, especially if algorithm parameters such as stepsizes are not learned as well; with learned stepsizes, deeper unrolling often does not improve performance, but shallow unrolling already gives good results.
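A toy illustration of the two training pipelines, on a scalar denoising problem with a closed-form lower-level solution (this is an independent toy, not the example from the talk): the exact bilevel approach selects the regularization parameter using the true minimizer, while the unrolled approach uses only N gradient-descent steps with a fixed stepsize, and the two generally select different parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training pairs for a denoising toy: X_clean are ground-truth signals, Y are noisy observations.
m, d = 200, 20
X_clean = rng.standard_normal((m, d))
Y = X_clean + 0.5 * rng.standard_normal((m, d))

def lower_level_exact(y, theta):
    # argmin_x 0.5*||x - y||^2 + theta/2*||x||^2 has the closed form y / (1 + theta).
    return y / (1.0 + theta)

def lower_level_unrolled(y, theta, n_steps, tau=0.5):
    # N steps of gradient descent on the same objective, starting from zero, fixed stepsize tau.
    x = np.zeros_like(y)
    for _ in range(n_steps):
        x = x - tau * ((x - y) + theta * x)
    return x

def upper_loss(recon):
    # Upper-level (training) loss: mean squared error against the clean signals.
    return np.mean((recon - X_clean) ** 2)

thetas = np.linspace(0.0, 2.0, 201)
theta_bilevel = thetas[np.argmin([upper_loss(lower_level_exact(Y, t)) for t in thetas])]
theta_unroll3 = thetas[np.argmin([upper_loss(lower_level_unrolled(Y, t, 3)) for t in thetas])]

print("theta selected by the exact bilevel loss:", theta_bilevel)
print("theta selected by 3-step unrolling:      ", theta_unroll3)
```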
Tuesday, March 19, 2024
Learning Convolution Operators with Kernel Methods
Ernesto De Vito (Uni Genoa, Italy)
There is a growing interest in the community regarding the problem of learning operators. This talk, based on joint work with Emilia Magnani, Philipp Hennig, and Lorenzo Rosasco, focuses on convolution operators, which play a significant role in signal and image processing, system identification, and partial differential equations. We approach the problem of learning convolution operators as a functional regression problem, where the covariates are linear operators. In this setting, we assume that the function of interest belongs to a reproducing kernel Hilbert space and consider a natural kernel ridge regression estimator. The accuracy of the proposed estimator is characterized in terms of finite sample bounds. We show that classical regularity assumptions have a novel and natural interpretation in this context.
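In schematic form (the notation is illustrative and not necessarily the paper's), given training pairs of signals (f_i, g_i) with g_i approximately equal to the unknown convolution h * f_i, the kernel ridge regression estimator of the convolution kernel reads

```latex
\hat{h}_{\lambda}\;=\;\arg\min_{h\in\mathcal{H}_{K}}\;
\frac{1}{n}\sum_{i=1}^{n}\bigl\|h*f_{i}-g_{i}\bigr\|^{2}
\;+\;\lambda\,\|h\|_{\mathcal{H}_{K}}^{2},
```

so that each covariate acts on h through the linear (convolution) operator induced by f_i, which is the sense in which the covariates are linear operators.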
On optimal approximation based on random samples
Mario Ullrich (JKU Linz, Austria)
We study the complexity of learning/approximation of functions from a given class based on function evaluations (a.k.a. samples), with emphasis on the worst-case setting. That is, we ask for the minimal amount of data needed by a (deterministic) algorithm to guarantee a prescribed error for all functions from the class.
It turns out that a suitable least squares method based on i.i.d. samples from a specific density is in many cases as good as, or even better than, all known sophisticated constructions, even when the error is measured in the uniform norm.
In this talk I'll introduce all the needed concepts, and try to survey this area of research from the perspective of Approximation theory and Information-based Complexity.
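The following sketch shows a weighted least-squares approximation from i.i.d. random samples in the simplest univariate setting: Legendre polynomials on [-1, 1], samples drawn from the Chebyshev (arcsine) density, and weights given by the ratio of the target (uniform) density to the sampling density. The choice of density, the basis, and all names are illustrative assumptions, not the specific construction from the talk.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)

f = lambda x: np.exp(x) * np.sin(3 * x)     # function to approximate on [-1, 1]
K = 10                                      # dimension of the polynomial space
n = 200                                     # number of random samples

# Sample from the Chebyshev (arcsine) density 1/(pi*sqrt(1-x^2)) on [-1, 1].
x = np.cos(np.pi * rng.random(n))
# Weights = target density (uniform, value 1/2) divided by the sampling density.
w = 0.5 * np.pi * np.sqrt(1.0 - x ** 2)

# Weighted least squares onto the first K Legendre polynomials.
V = legendre.legvander(x, K - 1)            # design matrix of shape (n, K)
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * V, sw * f(x), rcond=None)

# Evaluate the approximation error on a fine grid (uniform norm).
t = np.linspace(-1, 1, 2001)
err = np.max(np.abs(f(t) - legendre.legval(t, coef)))
print("uniform error:", err)
```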
Tuesday, January 23, 2024
Inverse learning in Hilbert scales
Abhishake Rastogi (Uni LUT, Finland)
We will discuss ill-posed inverse problems with noisy data in the framework of statistical learning. The corresponding operator equation is assumed to fit a given Hilbert scale, generated by some unbounded self-adjoint operator. Approximate reconstructions from random noisy data are obtained with general regularization schemes in such a way that they belong to the domain of the generator. The analysis thus has to distinguish two cases: the regular one, when the true solution also belongs to the domain of the generator, and the 'oversmoothing' one, when this is not the case. Rates of convergence for the regularized solutions will be expressed in terms of certain distance functions. For solutions with smoothness given in terms of source conditions with respect to the scale-generating operator, the error bounds can then be made explicit in terms of the sample size.
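In schematic notation (ours, not the talk's), the data are noisy point evaluations y_i = (A f^†)(x_i) + ε_i at random design points, and the Tikhonov-type reconstruction in the Hilbert scale generated by the unbounded operator L reads

```latex
\hat{f}_{\lambda}\;=\;\arg\min_{f\in\operatorname{dom}(L)}\;
\frac{1}{n}\sum_{i=1}^{n}\bigl|(Af)(x_i)-y_i\bigr|^{2}
\;+\;\lambda\,\|Lf\|^{2},
```

with general regularization schemes obtained by replacing the Tikhonov filter by other filter functions; the regular and oversmoothing cases then correspond to whether or not f^† itself lies in the domain of L.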
Data driven regularization
Andrea Aspri (Uni Milan, Italy)
I will introduce data-driven adaptations of classical regularization methods employed in addressing ill-posed inverse problems. The initial segment of the talk will delve into data-driven variations of the iteratively regularized Landweber method, tailored to both linear and nonlinear inverse problems. Two distinct strategies are explored: the first closely resembles the traditional iteratively regularized Landweber method, incorporating the average or geometric mean of the available data as a regularization term. The second approach integrates training data to estimate the interior of a black box, guiding the iteration process.
The second part of the talk deals with purely data-driven regularization approaches based on projection methods onto finite-dimensional space. This is particularly relevant for linear inverse problems where the forward operator is not explicitly known but is indirectly defined through input-output training pairs. I will provide specific details regarding theoretical results on the convergence and stability of the various methods. These theoretical insights will be substantiated by numerical experiments, with a particular emphasis on those involving the Radon transform.
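For the first part, a schematic form of one step of the iteratively regularized Landweber method is (notation ours)

```latex
x_{k+1}\;=\;x_{k}\;-\;\omega\,F'(x_{k})^{*}\bigl(F(x_{k})-y^{\delta}\bigr)
\;+\;\beta_{k}\,\bigl(\bar{x}-x_{k}\bigr),
```

where the classical method uses a fixed prior guess in place of \bar{x}; in the data-driven variants discussed, \bar{x} is built from the available data, for instance as an average or geometric mean, and supplies the regularization term pulling the iterates towards a data-informed anchor.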
Tuesday, November 21, 2023
Non-linear functional regression
Sergei Pereverzyev (RICAM)
Functional Regression (FR) involves data consisting of a sample of functions taken from some population. Most work in FR is based on a variant of the functional linear model first introduced by Ramsay and Dalzell in 1991. A more general form of polynomial functional regression has been introduced only quite recently by Yao and Müller (2010), with quadratic functional regression as the most prominent case. A crucial issue in constructing FR models is the need to combine information both across and within observed functions, which Ramsay and Silverman (1997) called replication and regularization, respectively. In this talk we present a general approach for the analysis of regularized polynomial functional regression of arbitrary order and indicate how a technique recently developed in the context of supervised learning can be used here. Moreover, we describe how multiple penalty regularization can be used in the context of FR and demonstrate an advantage of such use. Finally, we briefly discuss the application of FR in stenosis detection.
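For reference, the functional linear model and its quadratic extension take the schematic form (notation illustrative)

```latex
\begin{aligned}
\text{linear:}\quad & Y = a + \int X(t)\,\beta(t)\,dt + \varepsilon,\\
\text{quadratic:}\quad & Y = a + \int X(t)\,\beta(t)\,dt
  + \iint X(s)\,X(t)\,\gamma(s,t)\,ds\,dt + \varepsilon,
\end{aligned}
```

with polynomial functional regression of higher order adding integral terms with higher-order products of X; the coefficient functions β, γ, and so on are then estimated with (multiple penalty) regularization.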
Convex regularization in statistical inverse learning problems
Luca Ratti (University of Bologna)
We consider a problem at the crossroads of inverse problems and statistical learning: namely, the estimation of an unknown function from noisy and indirect measurements, which are only evaluated at randomly distributed design points. This occurs in many contexts in modern science and engineering, where massive data sets arise in large-scale problems from poorly controllable experimental conditions. When tackling this task, a common ground between inverse problems and statistical learning is represented by regularization theory, although with slightly different perspectives. In this talk, I will present a unified approach, leading to convergence estimates of the regularized solution to the ground truth, both as the noise on the data reduces and as the number of evaluation points increases. I will mainly focus on a class of convex, p-homogeneous regularization functionals (p being between 1 and 2), which allow moving from classical Tikhonov regularization towards sparsity-promoting techniques. Particular attention is given to the case of Besov norm regularization, which represents a case of interest for wavelet-based regularization. The most prominent application I will discuss is X-ray tomography with randomly sampled angular views. I will finally sketch some connections with recent extensions of our approach, including a more general family of sparsifying transforms and dynamical inverse problems.
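In schematic form (our notation), with data y_i = (A f^†)(x_i) + ε_i observed at randomly drawn design points x_i, the regularized estimator considered is

```latex
\hat{f}_{\lambda}\;\in\;\arg\min_{f}\;
\frac{1}{n}\sum_{i=1}^{n}\bigl|(Af)(x_i)-y_i\bigr|^{2}
\;+\;\lambda\,R(f),
```

with R convex and p-homogeneous for some p between 1 and 2; taking R to be (a power of) a Besov norm yields the wavelet-based, sparsity-promoting case highlighted in the talk.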