Blog


MASSA Podcast: Mathematics of Data Science and Deep Learning

February 7, 2021

I had the pleasure to be a guest of the MASSA Podcast. This podcast is an excellent initiative of the Mathematics, Actuarial, and Statistics Student Association (aka MASSA) of Concordia University. I have been a regular listener of the podcast since its launch in Fall 2020. It's been a great way to stay connected with students and colleagues from the Department during this period of social distancing. I'm quite proud to be part of this series!

In the episode, we talk about the Mathematics of Data Science, Compressed Sensing, and Deep Learning. Check it out here!

Invariance, encodings, and generalization: learning identity effects with neural networks

January 21, 2021

Matthew Liu (Concordia), Paul Tupper (SFU), and I have just submitted a new paper!

We provide a theoretical framework in which we can rigorously prove that learning algorithms satisfying simple criteria are not able to generalize so-called identity effects outside the training set. Identity effects are used to determine whether two components of an object are identical or not and arise often in language and in cognitive science. We show that a broad class of learning algorithms, including deep feedforward neural networks trained via gradient-based algorithms (such as SGD or Adam), satisfies our criteria, depending on the encoding of the inputs. In some broader circumstances we also provide adversarial examples that "fool" the learning algorithm. Finally, we demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs. Our experiments can be reproduced using this code.

Learning model used to identify pairs of identical digits. The model is obtained by concatenating a computer vision (CV) model and an identity effect (IE) model. From right to left: The model takes a pair of images as input (right); then, the CV model classifies them or, equivalently, encodes them as two 10-dimensional probability vectors (center); finally, the IE model assigns a rating from 0 to 1 to the pair of probability (or encoding) vectors to identify whether the images represent identical digits or not (left). In our paper, we show that the ability of the IE model to recognize identical pairs improves when the CV model is undertrained and decreases when the CV model is optimally trained. This is in accordance with our theory.
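To give a flavor of the phenomenon, here is a toy sketch (my own illustration, not the experimental setup of the paper): a logistic regression trained by gradient descent on one-hot encoded symbol pairs cannot distinguish a novel identical pair from a novel non-identical one, because the weights attached to unseen symbols never receive a gradient. All names and parameters below are invented for the illustration.

```python
import numpy as np

# Toy identity-effect task (illustrative only): classify whether two
# symbols are identical, with each symbol encoded as a one-hot vector.
n_symbols, n_train = 6, 4   # symbols 4 and 5 are held out of training

def encode(i, j):
    x = np.zeros(2 * n_symbols)
    x[i] = 1.0                # first symbol
    x[n_symbols + j] = 1.0    # second symbol
    return x

# Training set: all pairs of the first n_train symbols, labeled 1 if identical.
X = np.array([encode(i, j) for i in range(n_train) for j in range(n_train)])
y = np.array([1.0 if i == j else 0.0
              for i in range(n_train) for j in range(n_train)])

# Logistic regression trained by gradient descent from a zero initialization.
w, b, lr = np.zeros(X.shape[1]), 0.0, 0.5
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

# Coordinates belonging to the held-out symbols never receive a gradient,
# so their weights stay at zero: the model rates a novel identical pair
# and a novel non-identical pair exactly the same.
p_same = 1.0 / (1.0 + np.exp(-(encode(4, 4) @ w + b)))
p_diff = 1.0 / (1.0 + np.exp(-(encode(4, 5) @ w + b)))
print(p_same, p_diff)
```

The two printed ratings coincide exactly, which is the kind of failure to generalize identity effects that our criteria capture for orthogonal encodings.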

Deep neural networks are effective at learning high-dimensional Hilbert-valued functions from limited data

December 15, 2020

The accurate approximation of scalar-valued functions from sample points is a key task in mathematical modeling and computational science. Recently, machine learning techniques based on Deep Neural Networks (DNNs) have begun to emerge as promising tools for function approximation in scientific computing problems. In our recent paper, Ben Adcock, Nick Dexter, Sebastian Moraga and I broaden this perspective by focusing on approximation of functions that take values in a Hilbert space. This problem arises in many science and engineering problems, in particular those involving the solution of parametric Partial Differential Equations (PDEs).

Our contributions are twofold:

  1. We present a novel result on DNN training for holomorphic functions with so-called hidden anisotropy. This result introduces a DNN training procedure and a full theoretical analysis with explicit guarantees on the error and sample complexity. Our result shows that there is a procedure for learning Hilbert-valued functions via DNNs that performs as well as current best-in-class schemes.

  2. We provide preliminary numerical results illustrating the practical performance of DNNs on Hilbert-valued functions arising as solutions to parametric PDEs. We consider different parameters, modify the DNN architecture to achieve competitive results, and compare these to current best-in-class schemes.

Check out our paper here!

Learning the parameters-to-solution map of a diffusion equation with exponential parametric dependence from limited data. The figure shows approximations obtained using a tanh 5 × 50 DNN after training for 2 epochs (top) and 2045 epochs (bottom) of Adam. The DNN solution is displayed in blue/red colors. A reference finite element solution is displayed in green/yellow colors. (Numerics by Nick Dexter and Sebastian Moraga. See Figure 2 of our paper for more details).

Upcoming course in Winter 2021: Sparsity and compressed sensing

September 30, 2020

In Winter 2021, I will be offering a graduate course, Topics in Applied Mathematics: Sparsity and Compressed Sensing, at Concordia. The course is also available as an ISM course. Although mainly intended for graduate students, it can also be taken by (brave and hard-working!) undergraduate students at Concordia. The course is cross-listed as MAST 837, MAST 680/4 C, and MATH 494.

Course description:

Sparsity is a key principle in real-world applications such as image or audio compression, statistical data analysis, and scientific computing. Compressed sensing is the art of measuring sparse objects (like signals or functions) using the minimal amount of linear measurements. This course is an introduction to the mathematics behind these techniques: a wonderful combination of linear algebra, optimization, numerical analysis, probability, and harmonic analysis.

Topics covered include: l1 minimization, iterative and greedy algorithms, incoherence, restricted isometry analysis, uncertainty principles, random Gaussian sampling and random sampling from bounded orthonormal systems (e.g., partial Fourier measurements), applications to signal processing and computational mathematics.

A simple illustration that shows why minimizing the l1 norm (as opposed to the l2 norm) promotes sparse solutions to underdetermined linear systems.
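In the same numerical spirit, here is a small sketch (my own illustration, not course material): for an underdetermined system y = A x, we compare the dense minimum-l2-norm solution with an l1-regularized solution computed by ISTA, a simple proximal gradient method; only the latter is sparse. The problem sizes and parameters are arbitrary.

```python
import numpy as np

# Illustrative comparison of l2- vs l1-based solutions of y = A x with
# a random Gaussian A and a sparse ground truth.
rng = np.random.default_rng(1)
m, n, s = 20, 50, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

# Minimum l2-norm solution via the pseudoinverse: typically fully dense.
x_l2 = np.linalg.pinv(A) @ y

# ISTA for min (1/2)||A x - y||_2^2 + lam ||x||_1:
# a gradient step followed by soft thresholding.
lam = 1e-2
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / spectral norm squared
x_l1 = np.zeros(n)
for _ in range(5000):
    z = x_l1 - step * (A.T @ (A @ x_l1 - y))
    x_l1 = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

print("nonzeros, l2 solution:", int(np.sum(np.abs(x_l2) > 1e-6)))
print("nonzeros, l1 solution:", int(np.sum(np.abs(x_l1) > 1e-6)))
```

The soft-thresholding step is what promotes sparsity: coordinates whose gradient signal stays below the threshold are set exactly to zero, whereas the pseudoinverse spreads the energy of y over all coordinates.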

Minisymposium: The mathematics of sparse recovery and machine learning

July 1, 2020

Ben Adcock and I have co-organized a minisymposium on The Mathematics of Sparse Recovery and Machine Learning that will take place virtually on July 16 at the 2nd Joint SIAM/CAIMS Annual Meeting. This event was selected as part of the SIAG/CSE Special Minisymposium Track. We are really looking forward to it!

You can find more info here.

The benefits of acting locally

June 26, 2020

The sparsity in levels model has proved useful in numerous applications of compressed sensing, not least imaging. Reconstruction methods for sparse-in-levels signals usually rely on convex optimization, but little is known about iterative and greedy algorithms in this context. Ben Adcock (SFU), Matt King-Roskamp (SFU) and I bridge this gap in our new paper by showing new stable and robust uniform recovery guarantees for sparse-in-levels variants of the iterative hard thresholding and CoSaMP algorithms. Our theoretical analysis generalizes recovery guarantees currently available in the case of standard sparsity. We also propose and numerically test a novel extension of the orthogonal matching pursuit algorithm for sparse-in-levels signals.
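For readers who have not met these algorithms, here is a minimal sketch of standard iterative hard thresholding, the baseline that the sparse-in-levels variants generalize. This is my own illustrative code with arbitrary parameters, not the algorithms analyzed in the paper.

```python
import numpy as np

# Minimal sketch of iterative hard thresholding (IHT) for standard sparsity.
rng = np.random.default_rng(2)
m, n, s = 60, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

def hard_threshold(z, s):
    # Keep only the s largest-magnitude entries of z, zero out the rest.
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]
    out[keep] = z[keep]
    return out

# IHT: a gradient step on ||A x - y||_2^2 followed by hard thresholding.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n)
for _ in range(500):
    x = hard_threshold(x + step * (A.T @ (y - A @ x)), s)

print("support size:", int(np.sum(x != 0)))
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

Every iterate is s-sparse by construction; the sparse-in-levels variants replace this single global threshold with one budget per level.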

Sparse recovery in bounded Riesz systems and numerical methods for PDEs

May 14, 2020

Sjoerd Dirksen (Universiteit Utrecht), Hans Christian Jung (RWTH Aachen), Holger Rauhut (RWTH Aachen), and I have just submitted a new paper. We study sparse recovery with structured random measurement matrices having independent, identically distributed, and uniformly bounded rows and with a nontrivial covariance structure. This class of matrices arises from random sampling of bounded Riesz systems and generalizes random partial Fourier matrices. Our main result improves the currently available results for the null space and restricted isometry properties of such random matrices. The main novelty of our analysis is a new upper bound for the expectation of the supremum of a Bernoulli process associated with a restricted isometry constant. We apply our result to prove new performance guarantees for the numerical approximation of PDEs via compressive sensing using the CORSING method.
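As a concrete instance of this class of matrices, a random partial Fourier matrix is obtained by uniformly subsampling the rows of the unitary DFT matrix: after rescaling, the underlying system is orthonormal with entries uniformly bounded by 1. A quick sketch (my own illustration, with arbitrary sizes):

```python
import numpy as np

# Illustration: a random partial Fourier matrix, the prototypical example
# of random sampling from a bounded orthonormal system.
rng = np.random.default_rng(3)
N, m = 128, 32
k = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)  # unitary DFT

rows = rng.choice(N, size=m, replace=False)   # uniform row subsampling
A = np.sqrt(N / m) * F[rows, :]               # rescaled measurement matrix

# The system {sqrt(N) * (k-th DFT row)} is orthonormal, and all its
# entries have modulus exactly 1: this is the uniform boundedness.
print("max modulus of sqrt(N)*F entries:", np.max(np.abs(np.sqrt(N) * F)))
print("F unitary:", np.allclose(F.conj().T @ F, np.eye(N)))
```

A bounded Riesz system relaxes the orthonormality above to a nontrivial, well-conditioned covariance structure while keeping the uniform bound on the rows, which is the setting our result covers.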

Check out our preprint here!

When can neural networks learn identity effects? Accepted to CogSci2020!

May 9, 2020

A new paper in collaboration with Matthew Liu (Concordia) and Paul Tupper (SFU) has been accepted to the proceedings of CogSci 2020!

Motivated by problems in language and other areas of cognition, we investigate the ability of machine learning algorithms to learn identity effects, i.e. whether two components of an object are identical or not. We provide a simple framework in which we can rigorously prove that algorithms satisfying simple criteria cannot make the correct inference. We then show that a broad class of algorithms including deep neural networks with standard architecture and training with backpropagation satisfy our criteria, dependent on the encoding of inputs. We also demonstrate our theory with computational experiments in which we explore the effect of different input encodings on the ability of algorithms to generalize to novel inputs.

You can read our preprint here. A journal paper on this topic is currently in preparation.

3-active binary vs. Haar encoding. The ability of Neural Networks (NNs) to learn identity effects heavily relies on the encoding used to represent inputs. In the top row, a 3-active binary encoding is used and NNs are partially able to classify identity effects for previously unseen examples. In the bottom row, letters are encoded using the Haar encoding and NNs are unable to classify identity effects for new examples.

Compressive Isogeometric Analysis

March 17, 2020

Lorenzo Tamellini, Mattia Tani (CNR-IMATI Pavia), and I have just submitted a new paper.

Motivated by the difficulty of assembling the stiffness matrix in Isogeometric Analysis (IGA) when using splines of medium-to-high order, we propose a new methodology for solving PDEs over domains with a nontrivial geometry, called Compressive Isogeometric Analysis. Through an extensive numerical illustration, we demonstrate that the proposed approach has the potential to mitigate this problem by assembling only a small fraction of the discretization matrix. This is possible thanks to a suitable combination of the IGA principle with CORSING, a recently introduced numerical technique for PDEs based on compressive sensing.

You can read our preprint here!

A multiscale solution to the 2D Poisson equation defined over a quarter of annulus.
A sparse approximation to the solution on the left plot computed using Compressive Isogeometric Analysis. Only 30% of the discretization matrix is assembled to compute it.