Weakly Supervised CNN Segmentation: Models and Optimization

MICCAI 2020

October 8th

Scope

Deep convolutional neural networks (CNNs) currently dominate semantic segmentation problems, yielding ground-breaking results when full supervision is available, in a breadth of computer vision and medical imaging applications. A major limitation of such fully supervised models is that they require very large amounts of reliable training data, i.e., accurately and densely labeled (annotated) images built with extensive human labor and expertise. This is not feasible in many important problems and applications. In medical imaging, for instance, supervising semantic segmentation requires scarce clinical-expert knowledge and labor-intensive, pixel-level annotations of a large number of images, a difficulty further compounded by the complexity of the data, e.g., 3D, multi-modal or temporal data. Typically, medical image annotations are available only for relatively small data sets, and supervised learning models are seriously challenged by new samples that differ from the training data, for instance, due to changes in imaging protocols, clinical sites and subject populations. Despite the huge impact that deep learning has recently made in medical image analysis, current supervised learning models still have difficulty capturing the substantial variability encountered in real clinical contexts.

Weakly- and semi-supervised learning methods, which do not require full annotations and scale up to large problems and data sets, are currently attracting substantial research interest in both the CVPR and MICCAI communities. The general purpose of these methods is to mitigate the lack of annotations by leveraging unlabeled data with priors, either knowledge-driven (e.g., anatomy priors) or data-driven (e.g., domain-adversarial priors). For instance, semi-supervision uses both labeled and unlabeled samples, weak supervision uses uncertain (noisy) labels, and domain adaptation attempts to generalize the representations learned by CNNs across different domains (e.g., different modalities or imaging protocols). In semantic segmentation, a large body of very recent work has focused on training deep CNNs with very limited and/or weak annotations, for instance, scribbles, image-level tags, bounding boxes, points, or annotations limited to a single domain of the task (e.g., a single imaging protocol). Several of these works showed that adding specific priors in the form of unsupervised loss terms can achieve outstanding performance, close to full-supervision results, while using only a fraction of the ground-truth labels.

Description

This tutorial overviews very recent developments in weakly supervised CNN segmentation, and should be accessible and of high interest to the general MICCAI audience. More specifically, we will discuss several recent state-of-the-art models and connect them from the perspective of imposing priors on the representations learned by deep networks. First, we will detail the loss functions driving these models, including both knowledge-driven functions (e.g., anatomy, shape, or conditional random field losses) and data-driven functions (e.g., domain-adversarial losses). Then, we will discuss several possible optimization strategies for each of these losses, and emphasize the importance of the choice of optimizer. While presenting the basic mathematical modelling and algorithmic ideas, our emphasis is on conceptual understanding of the presented methods, so that tutorial attendees gain sufficient knowledge to use publicly available code and build their own models. We will demonstrate a number of practical applications and compare different methods. We will further present a case-study example, with a hands-on implementation in PyTorch.
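To make the notion of a knowledge-driven loss concrete, the minimal PyTorch sketch below combines a partial cross-entropy term computed on scribble-annotated pixels with a soft size (target-region area) prior, penalized quadratically so the overall objective stays differentiable and trainable with standard stochastic gradient descent. The function names, the binary foreground/background setting, and the size bounds are illustrative assumptions, not the exact losses covered in the tutorial.

```python
import torch
import torch.nn.functional as F


def partial_cross_entropy(logits, scribbles, ignore_index=255):
    """Cross-entropy on the few annotated (scribble) pixels only.

    logits:    (B, C, H, W) raw network outputs.
    scribbles: (B, H, W) integer labels, with `ignore_index` on unlabeled pixels.
    """
    return F.cross_entropy(logits, scribbles, ignore_index=ignore_index)


def size_prior_penalty(logits, lower, upper):
    """Soft size constraint on the predicted foreground region.

    Quadratically penalizes the predicted foreground area whenever it falls
    outside the prior bounds [lower, upper] (given in pixels). The penalty is
    differentiable, so it can be optimized jointly with the supervised term.
    """
    probs = torch.softmax(logits, dim=1)        # (B, C, H, W) class probabilities
    size = probs[:, 1].sum(dim=(1, 2))          # predicted foreground area per image
    too_small = (lower - size).clamp(min=0) ** 2
    too_large = (size - upper).clamp(min=0) ** 2
    return (too_small + too_large).mean()


# Illustrative training objective (weight, bounds and shapes are assumptions):
# logits = model(images)                        # (B, 2, H, W)
# loss = partial_cross_entropy(logits, scribbles) \
#        + 0.01 * size_prior_penalty(logits, lower=500.0, upper=5000.0)
# loss.backward()
```

This simple quadratic penalty is only one way to enforce a size prior; the tutorial also discusses alternative optimization strategies for such constrained losses and why the choice of optimizer matters.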

Learning objectives

This tutorial will enable participants to gain knowledge about state-of-the-art deep learning techniques for semantic segmentation when annotations are very limited and/or uncertain, including different scenarios that are of high interest in practice: semi-supervision, weak supervision and domain adaptation. While presenting the basic mathematical modelling and loss-optimization ideas, our emphasis is on conceptual understanding of the presented techniques, so that tutorial attendees gain sufficient knowledge to use publicly available code. We will also support the presentation of these methods with several medical-imaging applications and comparisons, so that attendees will develop:

      • A good understanding of the different weak-supervision models (i.e., loss functions and priors) and the conceptual connections between them, with the ability to choose the most appropriate model for a given application scenario;

      • A good knowledge of several possible optimization strategies for each of the examined losses, with the ability to choose the most appropriate optimizer for a given problem or application scenario;

      • A clear understanding of the main strengths and weaknesses of several state-of-the-art approaches, and how to use them in several medical image segmentation problems;

      • Basic knowledge of how to implement some of these solutions in a case-study example.

Organizers

Ismail Ben Ayed, Associate Professor at ETS Montreal.

Christian Desrosiers, Associate Professor at ETS Montreal.

Jose Dolz, Assistant Professor at ETS Montreal.


Hoel Kervadec, PhD candidate at ETS Montreal.