Aviad Aberdam

I am a PhD student at the Electrical Engineering department of the Technion, advised by Prof. Michael Elad. In 2020, I interned at Amazon, where I worked with Prof. Pietro Perona, and in 2019 I interned at BlackRock. Prior to that, I received my B.Sc. degree from the EE department of the Technion (2017).

My research interests include machine learning, computer vision, inverse problems and optimization. In particular, I am interested in the following subjects:

  • Theoretically analyzing deep learning architectures and their stability to adversarial noise, utilizing sparse coding tools and leading to empirical algorithm developments [1, 2, 3].

  • Proposing theoretically justified algorithms for inverting deep generative models [4].

  • Developing learned solver schemes that can handle varying models [5].

  • Deriving coordinate gradient descent algorithms for non-separable composite functions.

  • Presenting image morphing algorithms that combine the optimal transport technique with generative models (GANs) [6].

I received the 2020-2022 Azrieli fellowship, 2019 Fine fellowship, and 2017-2018 Meyer fellowship.

Contact Information:

Email: aaberdam@gmail.com

Office: Taub 400, Technion

Google Scholar

Selected Publications

Aviad Aberdam*, Ron Litman*, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, and Pietro Perona

We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. To account for the sequence-to-sequence structure, each feature map is divided into different instances over which the contrastive loss is computed. This operation enables us to contrast at a sub-word level, where from each image we extract several positive pairs and multiple negative examples. To yield effective visual representations for text recognition, we further suggest novel augmentation heuristics, different encoder architectures and custom projection heads. Experiments on handwritten text and on scene text show that when a text decoder is trained on the learned representations, our method outperforms non-sequential contrastive methods. In addition, when the amount of supervision is reduced, SeqCLR significantly improves performance compared with supervised training, and when fine-tuned with 100% of the labels, our method achieves state-of-the-art results on standard handwritten text recognition benchmarks.
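As a rough illustration of contrasting at the instance level, the sketch below splits a sequence of frame features into a few windows and scores two augmented views with an InfoNCE-style loss. The window averaging, the cosine similarity, the temperature `tau`, and the function names are assumptions made for this example; they are not taken from the paper and may differ from the loss and instance mapping it actually uses.

```python
import math

def window_average(frames, n_instances):
    """Split frame vectors into n_instances contiguous windows and average each.

    Hypothetical instance mapping; trailing frames that do not fill a window are dropped.
    """
    size = len(frames) // n_instances
    instances = []
    for k in range(n_instances):
        window = frames[k * size:(k + 1) * size]
        instances.append([sum(col) / len(window) for col in zip(*window)])
    return instances

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a, view_b, tau=0.1):
    """InfoNCE-style loss: instance i of view_a should match instance i of view_b.

    All other instances of view_b act as negatives for instance i.
    """
    total = 0.0
    for i, a in enumerate(view_a):
        logits = [cosine(a, b) / tau for b in view_b]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)

# Toy data: two views of the same 4-frame sequence with 2-dimensional features.
frames_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
frames_b = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0]]
inst_a = window_average(frames_a, 2)
inst_b = window_average(frames_b, 2)

aligned_loss = info_nce(inst_a, inst_b)
shuffled_loss = info_nce(inst_a, inst_b[::-1])
```

On this toy input the loss is small when the instances of the two views are aligned and large when their order is reversed, which is the behavior a contrastive objective is meant to reward.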

Aviad Aberdam*, Dror Simon*, and Michael Elad

Deep generative models (e.g. GANs and VAEs) have been developed quite extensively in recent years. Lately, there has been an increased interest in the inversion of such a model, i.e. given a (possibly corrupted) signal, we wish to recover the latent vector that generated it. Building upon sparse representation theory, we define conditions that are applicable to any inversion algorithm (gradient descent, deep encoder, etc.), under which such generative models are invertible with a unique solution. Importantly, the proposed analysis is applicable to any trained model, and does not depend on Gaussian i.i.d. weights. Furthermore, we introduce two layer-wise inversion pursuit algorithms for trained generative networks of arbitrary depth, and accompany these with recovery guarantees. Finally, we validate our theoretical results numerically and show that our method outperforms gradient descent when inverting such generators, both for clean and corrupted signals.

Aviad Aberdam, Alona Golts, and Michael Elad

Neural networks that are based on unfolding of an iterative solver, such as LISTA (learned iterative soft threshold algorithm), are widely used due to their accelerated performance. Nevertheless, as opposed to non-learned solvers, these networks are trained on a certain dictionary, and therefore they are inapplicable for varying model scenarios. This work introduces an adaptive learned solver, termed Ada-LISTA, which receives pairs of signals and their corresponding dictionaries as inputs, and learns a universal architecture to serve them all. We prove that this scheme is guaranteed to solve sparse coding in linear rate for varying models, including dictionary perturbations and permutations. We also provide an extensive numerical study demonstrating its practical adaptation capabilities. Finally, we deploy Ada-LISTA to natural image inpainting, where the patch-masks vary spatially, thus requiring such an adaptation.
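For context, the non-learned solver that such networks unfold can be sketched in a few lines. The code below is classical ISTA for the problem min_x 0.5*||y - Dx||^2 + lam*||x||_1, not Ada-LISTA itself; the function names, the toy dictionary, and the chosen step size are assumptions made for illustration. The step size should not exceed the reciprocal of the largest eigenvalue of D^T D.

```python
def soft_threshold(v, t):
    """Elementwise soft thresholding: sign(v) * max(|v| - t, 0)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def ista(D, y, lam, step, n_iter=200):
    """Classical ISTA: a gradient step on the data term, then soft thresholding."""
    Dt = transpose(D)
    x = [0.0] * len(D[0])
    for _ in range(n_iter):
        residual = [yi - ri for yi, ri in zip(y, matvec(D, x))]
        grad_step = [xi + step * gi for xi, gi in zip(x, matvec(Dt, residual))]
        x = soft_threshold(grad_step, step * lam)
    return x

# Toy example with an identity dictionary, where the solution is soft_threshold(y, lam).
D = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
x_hat = ista(D, y, lam=0.5, step=1.0)
```

Because the dictionary `D` is an explicit argument here, the same routine works for any dictionary; a network trained by unfolding these iterations for one fixed dictionary loses that property, which is the limitation the abstract describes.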

Dror Simon and Aviad Aberdam. Published in the Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

Image interpolation, or image morphing, refers to a visual transition between two (or more) input images. For such a transition to look visually appealing, its desirable properties are (i) to be smooth; (ii) to apply the minimal required change in the image; and (iii) to seem “real”, avoiding unnatural artifacts in each image in the transition. To obtain a smooth and straightforward transition, one may adopt the well-known Wasserstein Barycenter Problem (WBP). While this approach guarantees minimal changes under the Wasserstein metric, the resulting images might seem unnatural. In this work, we propose a novel approach for image morphing that possesses all three desired properties. To this end, we define a constrained variant of the WBP that enforces the intermediate images to satisfy an image prior. We describe an algorithm that solves this problem and demonstrate it using the sparse prior and generative adversarial networks.
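To give a feel for the unconstrained Wasserstein barycenter, here is a minimal sketch for the simplest case: two one-dimensional empirical distributions with equally many samples, where the 2-Wasserstein barycenter is obtained by averaging sorted samples. This is a textbook special case and not the constrained image-domain algorithm of the paper; the function name and the toy data are assumptions.

```python
def barycenter_1d(samples_a, samples_b, t):
    """W2 barycenter of two equal-size 1D empirical distributions, weights (1 - t, t).

    In one dimension the optimal transport plan matches sorted samples,
    so the barycenter interpolates between matched order statistics.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equally many samples in both inputs")
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return [(1 - t) * a + t * b for a, b in pairs]

# Toy example: the halfway point between two small point clouds.
source = [0.0, 1.0, 2.0]
target = [10.0, 12.0, 11.0]
midpoint = barycenter_1d(source, target, 0.5)
```

Sweeping `t` from 0 to 1 moves the samples smoothly from the source to the target. Nothing in this unconstrained interpolation keeps the intermediate results looking natural, which is the gap the image prior in the abstract is meant to close.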

Jeremias Sulam*, Aviad Aberdam*, Amir Beck, and Michael Elad. Published in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2020.

Parsimonious representations are ubiquitous in modeling and processing information. Motivated by the recent Multi-Layer Convolutional Sparse Coding (ML-CSC) model, we herein generalize the traditional Basis Pursuit problem to a multi-layer setting, introducing similar sparse enforcing penalties at different representation layers in a symbiotic relation between synthesis and analysis sparse priors. We explore different iterative methods to solve this new problem in practice, and we propose a new Multi-Layer Iterative Soft Thresholding Algorithm (ML-ISTA), as well as a fast version (ML-FISTA). We show that these nested first order algorithms converge, in the sense that the function value of near-fixed points can get arbitrarily close to the solution of the original problem. We further show how these algorithms effectively implement particular recurrent convolutional neural networks (CNNs) that generalize feed-forward ones without introducing any parameters. We present and analyze different architectures resulting from unfolding the iterations of the proposed pursuit algorithms, including a new Learned ML-ISTA, providing a principled way to construct deep recurrent CNNs. Unlike other similar constructions, these architectures unfold a global pursuit holistically for the entire network. We demonstrate the emerging constructions in a supervised learning setting, consistently improving the performance of classical CNNs while maintaining the number of parameters constant.

Yaniv Romano*, Aviad Aberdam*, Jeremias Sulam, and Michael Elad. Published in Journal of Mathematical Imaging and Vision (Special issue on Mathematical Foundations of Deep Learning in Imaging Sciences), 2019.

Despite their impressive performance, deep convolutional neural networks (CNNs) have been shown to be sensitive to small adversarial perturbations. These nuisances, which one can barely notice, are powerful enough to fool sophisticated and well performing classifiers, leading to ridiculous misclassification results. In this paper we analyze the stability of state-of-the-art deep-learning classification machines to adversarial perturbations, where we assume that the signals belong to the (possibly multi-layer) sparse representation model. We start with convolutional sparsity and then proceed to its multi-layered version, which is tightly connected to CNNs. Our analysis links the stability of the classification under noise to the underlying structure of the signal, quantified by the sparsity of its representation under a fixed dictionary. In addition, we offer similar stability theorems for two practical pursuit algorithms, which are posed as two different deep-learning architectures - the layered Thresholding and the layered Basis Pursuit. Our analysis establishes the better robustness of the latter to adversarial attacks. We corroborate these theoretical results by numerical experiments on three datasets: MNIST, CIFAR-10 and CIFAR-100.

Aviad Aberdam, Jeremias Sulam, and Michael Elad. Published in SIAM Journal on Mathematics of Data Science (SIMODS), 2018.

The recently proposed multi-layer sparse model has raised insightful connections between sparse representations and convolutional neural networks (CNN). In its original conception, this model was restricted to a cascade of convolutional synthesis representations. In this paper, we start by addressing a more general model, revealing interesting ties to fully connected networks. We then show that this multi-layer construction admits a brand new interpretation in a unique symbiosis between synthesis and analysis models: while the deepest layer indeed provides a synthesis representation, the mid-layer decompositions provide an analysis counterpart. This new perspective exposes the suboptimality of previously proposed pursuit approaches, as they do not fully leverage all the information comprised in the model constraints. Armed with this understanding, we address fundamental theoretical issues, revisiting previous analysis and expanding it. Motivated by the limitations of previous algorithms, we then propose an integrated - holistic - alternative that estimates all representations in the model simultaneously, and analyze all these different schemes under stochastic noise assumptions. Inspired by the synthesis-analysis duality, we further present a Holistic Pursuit algorithm, which alternates between synthesis and analysis sparse coding steps, eventually solving for the entire model as a whole, with provably improved performance. Finally, we present numerical results that demonstrate the practical advantages of our approach.