Abstracts and presentations

Esa Ollila: “An overview on the estimation of high-dimensional covariance and precision matrices”.

Abstract: For p-dimensional data, penalized (regularized) estimators of the covariance matrix or the precision matrix (the inverse of the covariance matrix) are important when the sample size is small or modest relative to p. In this talk, we provide a selective review of several recent developments in regularized covariance matrix and precision matrix estimation. This overview will address cases such as heavy-tailed distributions and outlier-contaminated scenarios, $g$-convex optimization, smooth and non-smooth penalty functions, and methods for selecting the involved penalty or regularization parameter. We illustrate the usefulness of these methods in sensor array, financial economics and bioinformatics applications.

TalkOllila.pdf

Florent Chatelain and Raphaël Bacher: “Robust error control for large scale inference: detecting extragalactic structures”.

In this talk we present a new method for detecting spatially extended targets in images, with global error control of the underlying pixel-wise multiple testing problem. A generic framework for robust control of the False Discovery Rate (FDR) is discussed, and a method that accounts for the spatial structure of the targets is implemented. Results on simulated data show conclusive gains in detection power for a nominal control level. The method is also applied to the detection of extragalactic structures in real data produced by the astronomical instrument MUSE.
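
For readers unfamiliar with FDR control, the sketch below shows the classical Benjamini-Hochberg step-up procedure applied to a list of pixel-wise p-values. It is only the standard baseline, not the robust, spatially aware method presented in the talk; the function name `benjamini_hochberg` and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Classical Benjamini-Hochberg step-up rule (assumes independent or
    positively dependent tests); shown only as a baseline illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    # Reject the k_max smallest p-values (step-up: ranks below k_max too).
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical pixel-wise p-values, invented for this example.
pixel_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.5, 0.9]
detections = benjamini_hochberg(pixel_p_values, alpha=0.05)
```

Each `True` entry of `detections` marks a pixel declared as a detection at nominal FDR level `alpha`.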

TalkChatelain.pdf

Jialun Zhou: “Online estimation of MGGD: the Riemannian Averaged Natural Gradient method”.

Multivariate Generalized Gaussian Distributions (MGGD) are a rich class of multivariate distributions, which have gained importance across many engineering applications (image processing, computer vision, radar and biomedical signal processing). Unfortunately, estimating the parameters of MGGD leads to non-linear matrix equations, whose solution becomes impractical in high-dimensional problems, or when dealing with very large datasets. To overcome this difficulty, the present paper proposes a new method for online estimation of MGGD parameters, called the Riemannian Averaged Natural Gradient (RANG) method. The RANG method is suitable for application with high-dimensional and large datasets, since it requires modest memory and computational resources. The present paper formulates this new method and presents some computer simulations to showcase its performance. It is seen that, while the RANG method makes less exhaustive use of available data, it still achieves performance identical to that of classical maximum-likelihood estimation for sufficiently large datasets.

TalkZhou.pdf

Frédéric Pascal: "New insights into the statistical properties of M-estimators with application to PolSAR image denoising"

In signal processing applications, knowledge of the scatter matrix is of crucial importance. It arises in diverse applications such as filtering, detection, estimation or classification. In most signal processing methods, the data can be locally modelled by a multivariate zero-mean circular Gaussian stochastic process, which is completely determined by its covariance matrix. In that case, the classical covariance matrix estimator is the sample covariance matrix (SCM), whose behavior is perfectly known: it follows the well-known Wishart distribution. Nevertheless, complex normality sometimes provides a poor approximation of the underlying physics. An alternative has been proposed by introducing elliptical distributions, namely the Complex Elliptically Symmetric distributions. In this context the SCM can perform very poorly, and M-estimators appear as very interesting candidates, mainly due to their flexibility with respect to the statistical model and their robustness to outliers and/or missing data. However, the behavior of such estimators is still not well understood, since they are described by fixed-point equations that make their statistical analysis very difficult. To fill this gap, the main contribution of this work is to prove that the distribution of these estimators is more accurately described by a Wishart distribution than by the classical asymptotic Gaussian approximation. Building on these theoretical results, we propose a new method for PolSAR image despeckling based on M-estimators.
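
To make the phrase "described by fixed-point equations" concrete, here is a minimal sketch of the fixed-point iteration for Tyler's M-estimator of scatter, one well-known member of the M-estimator family (the abstract does not say which M-estimators the talk uses). As simplifying assumptions, the data are real-valued and two-dimensional so that the matrix inverse can be written in closed form without external libraries; the names `tyler_shape_2d` and `true_shape` and the simulated heavy-tailed data are illustrative only.

```python
import random


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def tyler_shape_2d(samples, n_iter=200, tol=1e-10):
    """Fixed-point iteration for Tyler's M-estimator on zero-mean 2-D data.

    Iterates Sigma <- (p / n) * sum_i x_i x_i^T / (x_i^T Sigma^{-1} x_i),
    renormalising the trace to p at each step to fix the scale ambiguity.
    """
    p, n = 2, len(samples)
    sigma = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_iter):
        inv = inverse_2x2(sigma)
        new = [[0.0, 0.0], [0.0, 0.0]]
        for x0, x1 in samples:
            # Quadratic form x^T Sigma^{-1} x.
            q = inv[0][0] * x0 * x0 + (inv[0][1] + inv[1][0]) * x0 * x1 + inv[1][1] * x1 * x1
            w = p / (n * q)
            new[0][0] += w * x0 * x0
            new[0][1] += w * x0 * x1
            new[1][1] += w * x1 * x1
        new[1][0] = new[0][1]
        trace = new[0][0] + new[1][1]
        new = [[p * v / trace for v in row] for row in new]
        change = max(abs(new[i][j] - sigma[i][j]) for i in range(2) for j in range(2))
        sigma = new
        if change < tol:
            break
    return sigma


# Simulated heavy-tailed elliptical data: Gaussian vectors with the shape
# matrix below, multiplied by a random Pareto-distributed scale.
random.seed(0)
true_shape = [[1.5, 0.5], [0.5, 0.5]]  # trace equals p = 2
l11 = 1.5 ** 0.5                      # Cholesky factor of true_shape
l21 = 0.5 / l11
l22 = (0.5 - l21 ** 2) ** 0.5
samples = []
for _ in range(5000):
    z0, z1 = random.gauss(0, 1), random.gauss(0, 1)
    tau = random.paretovariate(1.5)
    samples.append((tau * l11 * z0, tau * (l21 * z0 + l22 * z1)))

shape_estimate = tyler_shape_2d(samples)
```

Because each sample is divided by its own quadratic form, the random scale `tau` cancels out, which is why this estimator is insensitive to heavy tails. The estimate is only defined up to scale, hence the trace normalisation.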

TalkPascal.pdf

Tülay Adali: "Data Fusion Through Matrix and Tensor Decompositions: On Current Solutions, Challenges, and Prospects"

In many fields today, such as neuroscience, remote sensing, computational social science, and physical sciences, multiple sets of data are readily available. The datasets might either be multimodal where information about a given phenomenon is obtained through different types of acquisition techniques resulting in datasets with complementary information but essentially of different types, or multiset where the datasets are all of the same type but acquired from different samples, at different time points, or under different conditions.

Matrix and tensor factorizations enable joint analysis, i.e., fusion, of these multiple datasets such that they can fully interact and inform each other while also minimizing the assumptions placed on their inherent relationships. This talk presents an overview of the main models that have been successfully used for fusion of multiple datasets, in particular those that are based on independent component and vector analysis as well as canonical polyadic decomposition. Uniqueness of these decompositions under rather relaxed conditions makes them especially attractive for fusion tasks where the ultimate goal is the interpretation of the underlying components/factors. After a review of conditions for the identification of these models, important practical considerations in their implementation are highlighted using multiple examples with an emphasis on reproducibility. The main challenges and the opportunities in the area are also addressed.

TalkAdali.pdf

Nicolas Le Bihan: “Asymptotic regime for improperness tests of complex random vectors”.

Improperness testing for complex-valued vectors and signals has been considered lately due to potential applications in complex-valued time series analysis, which arises in many fields from communications to oceanography. This paper provides new results for such tests in the asymptotic regime, i.e. when the vector and sample sizes grow commensurately to infinity. The studied tests are based on invariant statistics named canonical correlation coefficients. Limiting distributions for these statistics are derived, together with those of the Generalized Likelihood Ratio Test (GLRT) and Roy’s test, in the Gaussian case. This characterization in the asymptotic regime also makes it possible to identify a phase transition in Roy’s test, with potential application in the detection of complex-valued low-rank signals corrupted by proper noise in large datasets. Simulations illustrate the accuracy of the proposed asymptotic approximations.

TalkLeBihan.pdf

Filip Elvander: "Multi-Marginal Optimal Mass Transport with Partial Information"

During recent decades, there has been substantial development in both the theory and methods of optimal mass transport (OMT), as well as in exploring the use of this framework in numerous applications in engineering and economics. In this talk, we consider multi-marginal problems wherein only partial information of each marginal is available, a setup common in many inverse problems in, e.g., imaging and spectral estimation. Using illustrative examples from spatial spectral estimation, we show that multi-marginal OMT constitutes a versatile tool for addressing linear inverse problems by means of compact, convex formulations. Furthermore, by considering an entropy-regularized approximation of the original problem, we propose an algorithm corresponding to a block-coordinate ascent of the dual, allowing for finding solutions to the OMT problem in a computationally efficient manner. The obtained framework is illustrated using problems from sensor fusion and tracking of dynamical spectra.
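
As a point of reference for the entropy-regularized formulation, the sketch below implements the classical Sinkhorn iterations for the simplest case: two fully observed marginals. Each scaling update is an exact maximisation of the regularized dual over one block of variables, which is the sense in which such schemes are block-coordinate ascent methods. The multi-marginal, partial-information algorithm of the talk is not reproduced here; the function name `sinkhorn`, the cost matrix and the marginals are assumptions chosen for illustration.

```python
import math


def sinkhorn(mu, nu, cost, eps=0.5, n_iter=200):
    """Entropy-regularized optimal transport between two discrete marginals.

    Returns a transport plan P with row sums close to mu and column sums
    equal to nu, of the form P[i][j] = u[i] * K[i][j] * v[j].
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel of the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Block update 1: rescale rows so that row sums match mu.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Block update 2: rescale columns so that column sums match nu.
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Two hypothetical mass distributions on the grid points 0, 1, 2, 3,
# with squared-distance cost normalised to a maximum of 1.
mu = [0.4, 0.3, 0.2, 0.1]
nu = [0.1, 0.2, 0.3, 0.4]
cost = [[(i - j) ** 2 / 9.0 for j in range(4)] for i in range(4)]
plan = sinkhorn(mu, nu, cost)
```

Smaller values of `eps` give plans closer to the unregularized optimum but slow the convergence of the iterations, so `eps` is a trade-off parameter rather than a fixed constant.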

TalkElvander.pdf

Jean-Yves Tourneret and Marcelo Pereyra: “Bayesian Methods in Imaging Sciences”.

Modern imaging methods increasingly rely on the Bayesian statistical framework to solve challenging imaging problems. That is, they use stochastic models to represent the data observation process and the prior knowledge available, and they obtain solutions by using inference techniques stemming from Bayesian decision theory, delivering accurate and insightful results. Applying Bayesian strategies to imaging problems is not straightforward, and this drives the development of new methods and algorithms that tightly combine ideas from signal processing, stochastics, computational statistics, optimisation, numerical analysis, and beyond. This talk will present a range of exciting new developments in Bayesian analysis and computation methodology for solving imaging problems.

TalkTourneretPereyra.pdf

Malik Tiomoko: "Random Matrix Improved Estimation of a Large Class of Distances Between Covariance Matrices"

Many machine learning and signal processing applications, in fields as diverse as hyperspectral image or brain signal processing, rely on the statistical estimation of the distance between covariance matrices. In practice, standard estimates simply replace the unknown population covariances by sample covariances ideally obtained from numerous independent observations. However, in modern applications where data are possibly few and large-dimensional, those estimators are biased and induce dramatic approximation errors. In this article, based on advanced tools in random matrix theory, we provide consistent estimates of the distance between covariance matrices for a large family of metrics, with a particular emphasis on the popular Fisher distance. An application to covariance-based spectral clustering supports the strength of our estimators.

TalkTiomoko.pdf

Nabil El Korso: “Learning with the Expectation-Maximization Algorithm: Array Processing Applications”.

1- The EM Algorithm: definition of the EM algorithm, a toy example (learning an optimal mixture of fixed models), and tips on choosing the complete data

2- Multi-channel Signal Processing Applications:

  • Gaussian mixture learning with the EM Algorithm: Robust ML for DOA localization
  • Parallelization using the EM: Multi-source direction of arrival estimation
  • Robust mean inference with applications to robust graphical modeling of gene networks + robust calibration for large astronomical arrays
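
The toy example named in the outline, learning a mixture of fixed models, can be sketched as follows. This is one plausible reading of that item rather than the speaker's actual example: the component densities are fixed and known, only the mixing weights are learned, and the complete data are the observations together with their unobserved component labels. The function names, the two Gaussian components and the simulated data are assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_mixture_weights(data, components, n_iter=200):
    """EM for the mixing weights of a mixture whose components are fixed.

    `components` is a list of density functions. The E-step computes the
    posterior probability of each component for each observation; the
    M-step sets each weight to the average of those probabilities.
    """
    k = len(components)
    weights = [1.0 / k] * k
    # Component densities never change, so evaluate them once.
    densities = [[pdf(x) for pdf in components] for x in data]
    for _ in range(n_iter):
        totals = [0.0] * k
        for row in densities:
            joint = [w * d for w, d in zip(weights, row)]
            norm = sum(joint)
            for j in range(k):
                totals[j] += joint[j] / norm   # E-step responsibility
        weights = [t / len(data) for t in totals]  # M-step
    return weights


# Simulated data: 30% from N(-2, 1) and 70% from N(3, 1).
random.seed(1)
data = [random.gauss(-2.0, 1.0) for _ in range(300)]
data += [random.gauss(3.0, 1.0) for _ in range(700)]

components = [
    lambda x: gaussian_pdf(x, -2.0, 1.0),
    lambda x: gaussian_pdf(x, 3.0, 1.0),
]
weights = em_mixture_weights(data, components)
```

With the component labels as the hidden part of the complete data, both steps have closed forms, which illustrates why the choice of complete data matters for how simple the algorithm turns out to be.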

TalkELKORSO.pdf

Nora Ouzir: “Robust Similarity Measures for Motion Estimation in Ultrasound Images”.

In ultrasound imaging, motion estimation can be performed using similarity measures derived from a maximum likelihood perspective. A classical model is based on the Rayleigh multiplicative noise assumption. In this work, we introduce new robust similarity measures that take into account more realistic ultrasound scattering conditions, such as varying speckle densities and shadowing. The deviations from the Rayleigh statistics are modelled using the t-distribution for radio-frequency signals and the Nakagami-gamma model for the echo amplitudes. Experiments using synthetic, phantom and in vivo data show an improvement in motion estimation accuracy in comparison with the similarity measures based on the Rayleigh model and the sum of absolute differences.

TalkNoraOuzir.pdf

Stefano Fortunati: “Robust detection for MIMO radars”.

Motivated by recent developments in the massive MIMO paradigm in communication systems, in this talk we explore the potential benefits of having a very large number of antennas in MIMO radars. In particular, we focus on the target detection problem. We adopt a general MIMO radar signal model able to take into account the effects of i) the possibly imperfect orthogonality of the transmitted waveforms and ii) a spatially and temporally correlated disturbance with unknown distribution. Then, building upon results on robust statistics with dependent data, we develop a robust Wald-type test that guarantees certain detection performance irrespective of the generally unknown disturbance distribution. This is achieved by exploiting the spatial degrees of freedom offered by a MIMO radar equipped with a large number of antennas. Closed-form expressions for the level of significance (i.e. the probability of false alarm) and for the power (i.e. the probability of detection) of the proposed robust Wald-type test are derived. Finally, numerical results are shown to validate the asymptotic analysis in the finite system regime.

TalkFortunati.pdf

Michael Muma: “Robust Solution Path (RSP) Estimation for High-Dimensional Regression Problems”.

We investigate the solution path and the variable selection properties of regularized least squares regression estimators in high-dimensional settings. These are settings where the number of variables exceeds the number of observations. We consider the case when the regression matrix is contaminated by outliers. We propose a framework called Robust Solution Path (RSP) estimation that separates the solution path into a robust path containing only non-contaminated variables and a non-robust path that might contain some outlier-contaminated variables. This allows us to pick a robust parameter vector from the robust path. We propose and analyze two new methods: the RSP-Lasso and the RSP-elastic net. We prove that our estimators fulfill the RSP conditions. Both of our methods can be efficiently computed with a modified version of the LARS algorithm. We also propose the RSP-Bayesian Lasso and the RSP-Bayesian elastic net and prove for the former that its posterior distribution is unimodal. This is a desirable property, since it allows for meaningful mean and median estimates of the posterior distribution and a fast convergence of MCMC methods such as the Gibbs sampler. Simulations demonstrate that the proposed methods compare favorably to existing regularized robust and least-squares estimators. When selecting variables from potentially outlier-contaminated data sets, the proposed methods can help scientists and practitioners to report their findings with more confidence in the selected model.

TalkMuma.pdf

Elias Raninen: "Regularized sample covariance matrix estimators for multiple classes"

This talk considers the estimation of covariance matrices of multiple classes with limited training data. The sample covariance matrix (SCM) is known to perform poorly when the number of variables is large compared to the available number of samples. In order to reduce the mean squared error (MSE) of the estimator, regularized (shrinkage) SCM estimators are often used. In this work, we consider regularized SCM estimators for multiclass problems that combine two different target matrices for regularization: the pooled (average) SCM of the classes and the scaled identity matrix. Regularization towards the pooled SCM is beneficial when the population covariances are similar whereas regularization towards the identity matrix guarantees that the estimators are positive definite. We derive the MSE optimal regularization parameters as well as propose a method for their estimation under the assumption that the class populations follow unknown (unspecified) elliptical distributions. The performance of the estimators is demonstrated via synthetic data simulations as well as in an application in discriminant analysis classification and in a global minimum variance portfolio (GMVP) optimization problem using historical stock data.
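
The idea of shrinking each class SCM towards both the pooled SCM and a scaled identity can be sketched as below. The two-parameter form used here, the sample-size weighting of the pooled SCM, and the hand-picked values of `alpha` and `beta` are assumptions for illustration; the exact estimator and the MSE-optimal parameter choices derived in the talk are not reproduced.

```python
import random


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of observation vectors."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += d[a] * d[b] / (n - 1)
    return cov


def pooled_covariance(class_covs, class_sizes):
    """Average of the class SCMs, weighted by class sample size (assumed)."""
    n, p = sum(class_sizes), len(class_covs[0])
    return [[sum(nk / n * c[a][b] for c, nk in zip(class_covs, class_sizes))
             for b in range(p)] for a in range(p)]


def regularized_scm(class_cov, pooled_cov, alpha, beta):
    """Shrink a class SCM towards the pooled SCM, then towards a scaled identity.

    mixed  = alpha * class_cov + (1 - alpha) * pooled_cov
    result = beta * mixed + (1 - beta) * (trace(mixed) / p) * I
    """
    p = len(class_cov)
    mixed = [[alpha * class_cov[a][b] + (1 - alpha) * pooled_cov[a][b]
              for b in range(p)] for a in range(p)]
    scale = sum(mixed[a][a] for a in range(p)) / p
    return [[beta * mixed[a][b] + ((1 - beta) * scale if a == b else 0.0)
             for b in range(p)] for a in range(p)]


# Two simulated classes with fewer samples (4) than variables (5),
# so each class SCM on its own is singular.
random.seed(2)
p = 5
class_a = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(4)]
class_b = [[random.gauss(0.0, 1.5) for _ in range(p)] for _ in range(4)]
class_covs = [sample_covariance(class_a), sample_covariance(class_b)]
pooled = pooled_covariance(class_covs, [len(class_a), len(class_b)])

alpha, beta = 0.6, 0.8  # hypothetical, hand-picked regularization parameters
estimate = regularized_scm(class_covs[0], pooled, alpha, beta)
```

For `beta < 1` the identity term adds a strictly positive amount to every eigenvalue, which is why the regularized estimate is positive definite even though the class SCM is not.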

TalkRaninen.pdf