Material

Abstracts and Presentations

Lectures

## 1

Title: Space-sequential particle filters for high-dimensional dynamical systems described by stochastic differential equations

Speaker: Joaquín Míguez, UC3M, Spain

Abstract: We introduce a novel methodology for particle filtering in dynamical systems where the evolution of the signal of interest is described by an SDE and observations are collected instantaneously at prescribed time instants. The new approach includes the discretisation of the SDE and the design of efficient particle filters for the resulting discrete-time state-space model. The discretisation scheme converges with weak order 1 and is devised to create a sequential dependence structure along the coordinates of the discrete-time state vector. We then discuss a class of space-sequential particle filters that exploit this structure to improve performance when the system dimension is large. This is numerically illustrated by a set of computer simulations for a stochastic Lorenz 96 system with additive noise. The new space-sequential particle filters attain approximately constant estimation errors as the dimension of the Lorenz 96 system is increased, with a computational cost that increases polynomially, rather than exponentially, with the system dimension. This is joint work with Deniz Akyildiz and Dan Crisan (Imperial College London).
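
As a point of reference for the discretisation step, the sketch below simulates a stochastic Lorenz 96 system with additive noise using the standard Euler-Maruyama scheme, which has weak order 1 under suitable regularity conditions. It is not the scheme proposed in the talk, which is designed to induce a sequential dependence across coordinates. The forcing `forcing = 8`, the noise scale `sigma`, the step size `dt` and all function names are illustrative assumptions.

```python
import math
import random


def lorenz96_drift(x, forcing=8.0):
    """Lorenz 96 drift: (x[i+1] - x[i-2]) * x[i-1] - x[i] + F, with cyclic indices."""
    d = len(x)
    # Negative indices wrap around in Python, which gives the cyclic boundary.
    return [(x[(i + 1) % d] - x[i - 2]) * x[i - 1] - x[i] + forcing for i in range(d)]


def euler_maruyama(x0, n_steps, dt=0.005, sigma=0.5, forcing=8.0, seed=0):
    """Simulate dX = f(X) dt + sigma dW with the Euler-Maruyama scheme.

    All parameter values are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = list(x0)
    path = [list(x)]
    for _ in range(n_steps):
        drift = lorenz96_drift(x, forcing)
        # Additive noise: independent Gaussian increments with variance dt.
        x = [xi + dt * fi + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
             for xi, fi in zip(x, drift)]
        path.append(list(x))
    return path
```

Calling `euler_maruyama([8.0 + 0.01 * i for i in range(8)], n_steps=200)` returns a list of 201 states of dimension 8; a particle filter would propagate each particle with one such step between observation times.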


## 2

Title: Signal Processing with Large Dimensional Observations: a Random Matrix Theory Approach

Speaker: Xavier Mestre, CTTC, Spain

Abstract: Conventional tools in array signal processing have traditionally relied on the availability of a large number of samples acquired at each sensor or array element. Large sample size assumptions typically guarantee the consistency of estimators, detectors, classifiers and multiple other widely used signal processing procedures. However, practical scenarios and array mobility conditions, together with the need for low latency and reduced scanning times, impose strong limitations on the total number of observations that can be effectively processed. When the number of collected samples per sensor is small, conventional large sample asymptotic approaches are no longer relevant. The goal of this talk is to introduce a unifying framework for the study and design of array signal processing techniques under the constraint of a small number of observations per sensor. The main idea is to revisit the design of classical array processing techniques by exploiting the latest advances in large random matrix theory. We will review some of the most celebrated results in this field and we will introduce some recent advances related to covariance clustering/classification as well as multivariate time series.


## 3

Title: High-Order Portfolios: The Role of Heavy Tails and Skewness

Speaker: Daniel Palomar, HKUST, Hong Kong

Abstract: Markowitz’s mean-variance portfolio formulation considers a trade-off between the expected return and the risk measured by the variance. However, since financial data is not Gaussian distributed and shows asymmetry and heavy tails, it makes sense to also incorporate higher-order moments. Unfortunately, designing a portfolio based on the first four moments (i.e., mean, variance, skewness, and kurtosis) brings some critical difficulties, such as the computational and storage complexity and the non-convexity of the portfolio formulation. We will explore the evolution of high-order portfolios over more than half a century.
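
To make the four moments concrete, the sketch below computes the sample mean, variance, skewness and kurtosis of a portfolio's return series for a fixed weight vector. It works directly on the portfolio returns, so it sidesteps the co-skewness and co-kurtosis tensors (with N^3 and N^4 entries for N assets) that make the design problem expensive. The function name and the toy data are illustrative assumptions, not material from the talk.

```python
def portfolio_moments(returns, weights):
    """Sample moments of the portfolio return w^T r.

    returns: list of T rows, each holding N asset returns.
    weights: list of N portfolio weights.
    Returns (mean, variance, skewness, kurtosis) using 1/T normalisation.
    """
    port = [sum(w * r for w, r in zip(weights, row)) for row in returns]
    t = len(port)
    mean = sum(port) / t
    centered = [p - mean for p in port]
    var = sum(c ** 2 for c in centered) / t
    skew = sum(c ** 3 for c in centered) / t / var ** 1.5
    kurt = sum(c ** 4 for c in centered) / t / var ** 2  # non-excess kurtosis
    return mean, var, skew, kurt


# Toy data: 4 periods, 2 assets, equal weights.
toy_returns = [[0.01, 0.02], [-0.01, 0.00], [0.03, -0.02], [0.00, 0.01]]
mean, var, skew, kurt = portfolio_moments(toy_returns, [0.5, 0.5])
```

A high-order portfolio design would then trade these quantities off, typically rewarding mean and skewness while penalising variance and kurtosis.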


## 4

Title: Opportunistic Joint Sensing and Communications

Speaker: Marco Lops, University of Napoli, Italy

Abstract: In communications-enabled sensing, of particular interest in the millimeter-wave (mmWave) band, where radar and communications tend to be similar in both channel characteristics and signal processing, a radar receive chain is co-located with a communications transmitter, possibly sharing information, and suitably processes the backscattered signals in order to undertake short-range sensing of the environment. Conversely, radar-enabled communications rely on ambient backscattering, produced by low-complexity and nearly zero-consumption devices, in order to sustain communication links. Among the enabling factors for these architectures is the recently developed technology of Reconfigurable Intelligent Surfaces (RIS), which are low-consumption, intelligent mirrors capable of redirecting stray rays towards a desired point. The talk reviews some recent results concerning these two complementary paradigms, both inspired by the Integrated Sensing and Communications (ISAC) philosophy.


## 5

Title: Controlled Discovery and Localization of Signals via Bayesian Linear Programming (BLiP)

Speaker: Lucas Janson, Harvard University, USA

Abstract: In many high-dimensional statistical problems, it is necessary to simultaneously discover signals and localize them as precisely as possible. For instance, genetic fine-mapping aims to discover causal genetic variants, but the strong local dependence structure of the genome makes it hard to identify the exact locations of those variants. So the statistical task is to output as many regions as possible and have those regions be as small as possible, while controlling how many outputted regions contain no signal. The same type of problem arises in any signal discovery application where signals cannot be perfectly localized, such as locating stars in astronomical sky surveys and change-point detection in time series. However, there are two competing objectives: maximizing the number of discoveries and minimizing the size of those discoveries (all while controlling false discoveries), so our first contribution is to propose a single unified measure we call the resolution-adjusted power that formally trades off these two objectives and hence, at least in principle, can be maximized subject to a constraint on false discoveries. We take a Bayesian approach, but the resulting constrained posterior optimization over candidate discovery regions is nonconvex and extremely high-dimensional. Thus our second contribution is Bayesian Linear Programming (BLiP), which uses linear programming to find a feasible solution (i.e., it controls false discoveries) that verifiably nearly maximizes the expected resolution-adjusted power. BLiP is remarkably computationally efficient and can wrap around any Bayesian model and algorithm for approximating the posterior distribution over signal locations. Applying BLiP on top of existing state-of-the-art Bayesian analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased the resolution-adjusted power by 30-120% with just a few minutes of computation. 
BLiP is implemented in the new packages pyblip (Python) and blipr (R). This is joint work with Asher Spector (Stanford).


## 6

Title: Learning without labels on multivariate biosignals: From unsupervised to self-supervised learning

Speaker: Alexandre Gramfort, INRIA, France

Abstract: Machine learning has revolutionized many scientific fields over the last decade by relying mostly on supervised learning. Although very intuitive, the supervised approach is now challenged in core applications of ML: vision, speech but even more massively NLP with modern language models. Biomedical applications, where acquiring labels is often costly or even subjective, provide a pressing opportunity to move away from the supervised way. In this talk, I will present recent contributions towards this goal. I will start by presenting a first line of works building on old ideas of latent factor models with independence assumptions. Starting from single subject to multiview/multisubjects models, I will explain how problem structure can be used to obtain statistical guarantees and fast algorithms leveraging the sparse structure of the Hessian or the solutions. I will then explain how self-supervised learning tasks can be designed to be relevant on electroencephalography (EEG) data during sleep, and how it can reveal interesting latent structures in the data.

Short Talks

## 1

Title: Mixed-spectrum signals – Covariance, Spectral Estimation, and Implications for Array Processing

Speaker: Filip Elvander, Aalto University, Finland

Abstract: The estimation of the covariance function of a stochastic process, or signal, is of integral importance for a multitude of signal processing applications. For example, in array processing, the array covariance matrix is used for inferring source-to-sensor delays in order to, e.g., localize signal sources or to perform interference suppression. Often, estimation algorithms are derived under the assumption of narrow-band, i.e., sinusoidal, sources and of access to statistically independent array snapshots, leading to well-behaved estimates of inter-sensor covariances. In this talk, we study the consequences of deviations from these assumptions. In particular, we present closed-form expressions for the covariance of covariance estimates for mixed-spectrum continuous-time signals, i.e., spectra containing both absolutely continuous and singular parts. Furthermore, we consider approximating signals with arbitrary spectral densities by sequences of singular spectrum, i.e., sinusoidal processes, and show that the limiting behavior of covariance estimates as both the sample size and the number of sinusoidal components tend to infinity can be described by a time-frequency resolution product. The impact of the theoretical results on practical scenarios of spatial spectral estimation is demonstrated in examples of direction-of-arrival estimation.


## 2

Title: Fusion of MRI and ultrasound images for endometriosis surgery

Speaker: Jean-Yves Tourneret, INP Toulouse, France

Abstract: Endometriosis is a gynecologic disorder that typically affects women of reproductive age and is associated with chronic pelvic pain and infertility. In the context of preoperative diagnosis and guided surgery, endometriosis is a typical example of a pathology that requires the use of both magnetic resonance imaging (MRI) and ultrasound (US) modalities. These modalities are used side by side because they contain complementary information. However, MRI and US images have different spatial resolutions, fields of view, and contrasts and are corrupted by different kinds of noise, which makes their analysis by radiologists challenging. The fusion of MRI and US images is a way of facilitating the task of medical experts and improving the preoperative diagnosis and surgery mapping. This talk will summarize some recent work on a new automatic fusion method for MRI and US images investigated at the IRIT laboratory. This method combines the advantages of each modality, i.e., good contrast and signal-to-noise ratio for the MRI image and good spatial resolution for the US image. Experiments conducted on synthetic and experimental phantom images will illustrate the performance of the method.

## 3

Title: An Intrinsic McAulay-Seidman Bound for Parameters Evolving on Matrix Lie Groups

Speaker: Samy Labsir, ISAE-SUPAERO, France

Abstract: Lower bounds on the mean square error (MSE) are of fundamental importance for determining the ultimate achievable estimation performance of any unbiased estimator. Even if the Cramér-Rao bound (CRB) is the most popular one, mainly due to its simplicity of calculation, other bounds are of interest in several applications. In this communication, we derive a new intrinsic McAulay-Seidman bound (IMSB) for the estimation of unknown deterministic parameters lying on Lie groups, which generalizes known results on the intrinsic CRB. The validity of the proposed IMSB is shown for the Gaussian observation model with unknown deterministic parameters belonging to SO(3) by comparing the IMSB with the intrinsic MSE.

## 4

Title: Revisiting State Estimation with State and/or Filter Equality Constraints

Speaker: Eric Chaumette, ISAE-SUPAERO, France

Abstract: This work revisits optimal state estimation, in the mean squared error matrix sense, for linear systems with state and/or filter subject to linear equality constraints (LECs). First, it is shown that the conventional Wiener filter (WF) form incorporates any LECs on the state, thus yielding a filter subject to the same LECs. Conversely, an optimal linear filter subject to LECs (or linear equality gain constraints) in general does not exist. Therefore, adding LECs on the WF or WF gain matrix either leaves the constrained WF performance unchanged (best case) or degrades it w.r.t. the unconstrained WF. Since the Kalman filter (KF) and Kalman predictor (KP) are recursive WF forms for linear discrete state-space (LDSS) systems, the same results hold for both estimators, which contradicts several existing results in the literature. Although these existing results are mathematically correct, they have been derived for unsuitable assumed LDSS models in which the state, surprisingly, does not comply with the assumed LECs. Indeed, it is shown that for suitable assumed state models, both standard KF and KP forms satisfy the assumed LECs, making any additional projection step superfluous.


## 5

Title: Multifractal Anomaly Detection in Images Via Space-Scale Surrogates

Speaker: Lorena León, University of Toulouse, France

Abstract: Multifractal analysis provides a global description for the spatial fluctuations of the strengths of the pointwise regularity of image amplitudes. A global image characterization leads to robust estimation, but is blind to and corrupted by small regions in the image whose multifractality differs from that of the rest of the image. Prior detection of such zones with anomalous multifractality is thus crucial for relevant analysis, and their delineation of central interest in applications, yet has never been achieved so far. The goal of this work is to devise and study such a multifractal anomaly detection scheme. Our approach combines three original key ingredients: i) a recently proposed generic model for the statistics of the multiresolution coefficients used in multifractal estimation (wavelet leaders), ii) an original surrogate data generation procedure for simulating a hypothesized global multifractality and iii) a combination of multiple hypothesis tests to achieve pixel-wise detection. Numerical simulations using synthetic multifractal images show that our procedure is operational and leads to good multifractal anomaly detection results for a range of target sizes and parameter values of practical relevance.


## 6

Title: Robust low rank and sparse matrix decomposition for through the wall radar imaging

Speaker: Hugo Brehier, CentraleSupélec, France

Abstract: Through the wall radar imaging, in its most common setting, aims at detecting targets inside a closed room from the outside. Using SAR imaging in a chosen frequency band, we can apply sparse and low rank decomposition methods to retrieve the target positions. This new work builds upon our previous work by adding a robust cost function to better handle outliers in the measurements. A new optimization problem is introduced and solved in two different ways, which are compared via simulations.

## 7

Title: Generative Models and Role of Deep Neural Networks

Speaker: Saikat Chatterjee, KTH, Sweden

Abstract: Generative models can explain the process of generation for signals/data. The Gaussian distribution, the Fourier transform and wavelets are examples of generative models. They are widely used across many fields. If we know the parameters of a Gaussian distribution, then we know two important aspects: (a) How to generate a synthetic sample? (b) What is the likelihood of a real sample in the sense of probability? In this seminar, we will explore the use of deep neural networks (DNNs) for generative models. We will first discuss a fundamental structure of neural networks, and then explain the use of DNNs for generative adversarial networks (GANs), normalizing flows, etc. Applications include interpolation, pattern recognition, and dynamical system design.
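
The two aspects (a) and (b) can be sketched for a univariate Gaussian as follows. This is a minimal illustration with hypothetical function names, not material from the seminar; DNN-based models such as normalizing flows aim to offer the same two operations for far richer distributions.

```python
import math
import random


def sample_gaussian(mu, sigma, rng):
    """(a) Generate one synthetic sample from N(mu, sigma^2)."""
    return rng.gauss(mu, sigma)


def gaussian_log_likelihood(x, mu, sigma):
    """(b) Log-density of an observed sample x under N(mu, sigma^2)."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


rng = random.Random(0)
synthetic = [sample_gaussian(1.0, 2.0, rng) for _ in range(5)]
scores = [gaussian_log_likelihood(x, 1.0, 2.0) for x in synthetic]
```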

## 8

Title: Estimation of Extended Target Impulse Response of Unknown Size

Speaker: Corentin Lubeigt, ISAE-SUPAERO, France

Abstract: When a signal is strongly distorted by a reflecting surface, the surface can be seen as a filter whose impulse response is convolved with the incident signal. Depending on the application, it can be useful to estimate this impulse response in order to either compensate for it or interpret it. When it comes to estimation, the performance lower bounds should be computed in order to better understand the limits of the model. After a brief presentation of the context and the signal model, this presentation outlines the main steps to derive an easy-to-use closed-form Cramér-Rao bound and to validate it using asymptotic properties of the maximum likelihood estimator. The validation process of these bounds raises the problem of the size, generally unknown, of the impulse response to estimate. Consequently, a second part of the presentation will provide adapted tools to determine the size of a given impulse response alongside its estimation.

## 9

Title: On the detection of off-grid targets

Speaker: Pierre Develter, ONERA, France

Abstract: In detection, unknown signal parameters are usually handled with the Generalized Likelihood Ratio Test (GLRT), which replaces the unknown parameters by their Maximum Likelihood Estimates (MLEs) in the Likelihood Ratio Test. Under certain hypotheses, it leads to the well-known Matched Filter (MF) or Normalized Matched Filter (NMF) test. When the MLEs are not available, the parameters are assumed, for ease of implementation, to lie on a discrete grid. However, in practice, the target parameters will not fall exactly on the grid. This decreases the performance of the detection tests. In my talk, I will present this off-grid context before discussing some considerations on the off-grid MF and NMF tests (off-grid GLRTs) where the likelihood is maximized over the unknown parameters. We show how to approximate those tests thanks to a monopulse approach, and how to derive their statistical PFA-threshold relationship.

## 10

Title: Finding the oracle solution

Speaker: Marcus Carlsson, Lund University, Sweden

Abstract: The main workhorse in compressive sensing is the LASSO algorithm, which under certain assumptions converges to a point near the so-called "oracle solution". But are these assumptions fulfilled in a standard application? I will first discuss this and then present some non-convex methods that can actually find the oracle solution.
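
The gap between LASSO and the oracle solution is easiest to see in the special case of an orthonormal sensing matrix, where the LASSO minimiser of 0.5 * ||y - A x||^2 + lam * ||x||_1 is the soft-thresholding of z = A^T y. The sketch below assumes that "oracle solution" means the least-squares fit restricted to the true support, which is the usual meaning but an assumption here; the numbers and function names are purely illustrative.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |x|: shrink z towards zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(z, lam):
    """LASSO solution when A^T A = I, given z = A^T y."""
    return [soft_threshold(zi, lam) for zi in z]


def oracle_orthonormal(z, support):
    """Least squares restricted to a known support when A^T A = I."""
    return [zi if i in support else 0.0 for i, zi in enumerate(z)]


z = [3.0, 0.2, -2.5, -0.1]            # hypothetical correlations A^T y
lasso = lasso_orthonormal(z, 0.5)      # [2.5, 0.0, -2.0, 0.0]
oracle = oracle_orthonormal(z, {0, 2})  # [3.0, 0.0, -2.5, 0.0]
```

In this toy case LASSO recovers the support, but every nonzero entry is biased towards zero by `lam`, so it lands near the oracle solution rather than on it.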

## 11

Title: T-Rex Sparse PCA

Speaker: Jasin Machkour, TU Darmstadt, Germany & Arnaud Breloy, University Paris Nanterre, France

Abstract: Principal component analysis (PCA) is a popular unsupervised method that is commonly used in dimension reduction. This is because a few principal components (PCs) often suffice to explain the essence of the original p-dimensional samples. Different variants of PCA exist: while the “vanilla” PCA seeks to maximize the variance captured by the projection onto a linear subspace of dimension k<p, generalizations of this approach come in many flavors. Among them, Sparse PCA (SPCA) performs the double duty of variable selection and dimension reduction by requiring the low-dimensional basis of the subspace to be sparse (that is, creating PCs that are linear combinations of only a few variables). Unfortunately, when SPCA is used for variable selection, the amount of explained variance may not be a reliable measure. Especially in high-dimensional settings where the number of variables (p) is large compared to the sample size (n), randomly correlated variables may be selected. To address such problems, this talk presents a false discovery rate (FDR) controlled globally sparse PCA (i.e., shared support among the PCs). We start from the seminal cast of SPCA as a series of elastic-net problems. This allows us to leverage the recently developed T-Rex selector to perform an efficient variable selection while controlling the FDR. Specifically, T-Rex SPCA operates by fusing the support of multiple early terminated random experiments, which are conducted on a combination of the original predictors and multiple sets of randomly generated dummy variables. While the method is broadly applicable, an example from genetics is used to illustrate the FDR-controlled T-Rex SPCA approach.

## 12

Title: Linear and nonlinear multiview analysis with application in epilepsy

Speaker: Tanuj Hasija, Paderborn University, Germany

Abstract: Multiview analysis is essential in various fields like biomedicine, image processing, robotics and wearable technology. Canonical correlation analysis (CCA) is one of the most common tools for analysing second-order association and extracting shared factors in data from two different views. This presentation discusses CCA and its variants for linear and nonlinear multiview analysis, with a focus on identifiability of the shared subspace. Further, the multiview analysis methods are applied to identify seizure-induced changes in different modalities of the autonomic nervous system in the brain, and they offer the possibility of a potential biomarker for epileptic seizure prediction.

## 13

Title: Reconstruction of Multivariate Sparse Signals from Mismatched Samples

Speaker: Taulant Koka, TU Darmstadt, Germany

Abstract: Erroneous correspondence between samples and their respective channels or targets is a type of corruption that commonly arises in several real-world applications, such as whole-brain calcium imaging of freely moving organisms, the observation of insect flight and migration based on entomological radar, or multi-target tracking. We formalize the problem of reconstructing shuffled multi-channel signals that admit a sparse representation in a continuous domain and show that unique recovery is possible. We show that the problem is equivalent to a structured unlabeled sensing problem with sensing matrix estimation. Unfortunately, existing methods are neither robust to errors in the regressors nor do they exploit the structure of the problem. Therefore, we propose a novel robust two-step approach for the reconstruction of shuffled sparse signals. The proposed approach is evaluated on both synthetic and artificially shuffled real calcium imaging traces, showing a significant performance gain compared to existing methods.

## 14

Title: On optimal training statistics for neural decoders

Speaker: Meryem Benammar, ISAE-SUPAERO, France

Abstract: Channel coding and decoding, or error correction, is one of the key components of the information transmission chain in digital communication systems. Channel decoding often calls for very complex and time-consuming processing at the receiver. Hence, alternative implementations based on neural networks have attracted a lot of attention in the past years in order to construct near-optimal low-complexity decoders. However, the choice of the meta parameters in the training phase is, to date, purely heuristic, especially when it comes to generating training datasets. In this talk we investigate the choice of the training dataset statistics and the effect of the training/validation mismatch on the performance of neural-based decoders. We show, by means of a surrogate loss analysis, that there exists an optimal training statistic which provides both tight performance and good generalization, and thus minimizes the effect of the validation/training statistics mismatch. We illustrate the existence of such an optimal training statistic on two statistical channel models, Additive White Gaussian Noise (AWGN) and Binary Symmetric Channels (BSC), and show that neither totally noiseless nor totally noisy training conditions are favorable.

## 15

Title: On the effect of dimension in nested importance samplers for Bayesian inference

Speaker: Omar Fabián González Hernández, UC3M, Spain

Abstract: Many Bayesian inference problems involve high dimensional models for which only a subset of the model variables are actual estimation targets. All other variables, while they have to be taken into account in one way or the other, are just nuisance parameters that we would ideally like to integrate out analytically. Unfortunately, such integration is often impossible. However, several computational methods have been proposed over the past 15 years that replace intractable analytical marginalisation by numerical integration, typically using different flavours of importance sampling. Such methods include particle Markov chain Monte Carlo, sequential Monte Carlo squared (SMC2), nested particle filters and others. In this talk we discuss the role of the dimension of the nuisance parameters on the error bounds that can be attained when using nested importance samplers (including the SMC2 algorithm) for Bayesian inference in different probabilistic models, both static and dynamic. Specifically, we show how, under certain assumptions, it is possible to obtain approximation error bounds that hold uniformly over the dimension of the nuisance parameters, i.e., it is possible to guarantee that the Monte Carlo approximation error remains bounded when the dimension of the nuisance parameters increases without bound. We illustrate these theoretical results with some simple numerical simulations. This is joint work with Joaquín Míguez (Universidad Carlos III) and Víctor Elvira (University of Edinburgh).

## 16

Title: Multiple Hypothesis Testing for Spatial Inference in Sensor Networks

Speaker: Martin Gölz, TU Darmstadt, Germany

Abstract: The problem of identifying regions of spatially interesting, different or adversarial behavior is inherent to many practical applications involving distributed multisensor systems. In this talk, we discuss a recently developed general framework stemming from multiple hypothesis testing to identify such regions. The inference results are simple to interpret due to the guaranteed control of the false discovery rate at a pre-specified level. To reduce the intra-sensor-network communication overhead, the raw data is pre-processed locally at the sensors and a summary statistic is sent to the cloud or fusion center, where the actual spatial inference using multiple hypothesis testing and false discovery control takes place. Local false discovery rates (lfdrs) are estimated to express local beliefs in the state of the spatial signal. The method is agnostic to specific spatial propagation models of the underlying physical phenomenon.
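
For readers unfamiliar with false discovery rate control, the sketch below implements the classical Benjamini-Hochberg step-up procedure on a list of per-sensor p-values. This is a different and simpler technique than the lfdr-based spatial inference described in the talk, and it controls the FDR at level `alpha` under independence of the p-values; the function name and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return the sorted indices of hypotheses rejected by the BH procedure.

    Finds the largest rank k with p_(k) <= alpha * k / m and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical p-values reported by five sensors.
sensor_p_values = [0.001, 0.8, 0.012, 0.04, 0.5]
flagged = benjamini_hochberg(sensor_p_values, alpha=0.05)  # [0, 2]
```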

## 17

Title: Robustness in Distributed Learning

Speaker: Stefan Vlaski, Imperial College, UK

Abstract: Distributed learning paradigms, such as federated and decentralized learning, allow a collection of intelligent agents to collaborate in solving a machine learning task while preserving scalability, communication efficiency, and privacy. Examples of such multi-agent systems include sensor networks, cloud-connected mobile devices such as phones, autonomous vehicles and social networks. Despite their advantages, distributed structures are susceptible to malicious behavior by a subset of agents. This observation motivates the need to develop robust algorithms for distributed learning. This talk will describe some recent developments and open questions in robustness for distributed learning.

## 18

Title: From signal models to Riemannian geometry and application to machine learning

Speaker: Ammar Mian, USMB, France

Abstract: Many signal processing and machine learning tasks involve structured data points or structured parameters such as data on the sphere, covariance matrices, or principal subspaces. Riemannian geometry offers a formalism that can account for inherent constraints or invariances: endowing a manifold (smooth parameter space) with a metric induces a geometry, with corresponding tools such as geodesics and Riemannian distance. When assuming a statistical model, a Riemannian geometry can also naturally appear by endowing the parameter space with the Fisher information metric. This framework is referred to as information geometry. Interestingly, it yields a point of view that allows leveraging many tools from differential geometry for solving or analyzing statistical learning problems. Beyond model-based approaches, Riemannian geometry is also a prominent tool in data-driven methodologies such as metric learning and neural networks. This talk presents a general introduction to these concepts and their practical applications. The exposition will be divided into three parts:

  • A primer on Riemannian geometry for matrix manifolds and how to leverage these tools in Riemannian optimization problems. The main examples will be the space of symmetric positive definite matrices and subspaces (Grassmann manifold).

  • A focus on information geometry and its use in estimation problems related to multivariate Gaussian and elliptical distributions. Application examples will include robust mean and covariance matrix estimation, probabilistic PCA, and blind source separation.

  • An overview of clustering and classification methods based on Riemannian approaches (EM for Riemannian distributions, Riemannian K-means/nearest centroid, Riemannian neural network architectures such as SPDnet). Examples will include applications in remote sensing (satellite image time series analysis), image processing (pedestrian detection), and brain computer interface (EEG signals classification).