Ask Sohini

Questions Answered from AI with Sohini

In this column, I respond to commonly asked questions from my YouTube channel, LinkedIn and Instagram.

Q3. [March 26, 2021] [Video: Image Segmentation with U-net]

I have to present a journal reading on segmentation of the optic cup and optic disc using a cascade network. Can you describe what a cascade network means? This is something new for me. I'm from Indonesia, and I couldn't find any references about U-Net in Indonesia.

Ans. Thanks for your question. As the word "cascade" suggests, a cascaded U-net is a combination of multiple U-nets. The combination can be either a linear/serial combination, as seen in Fig 1 below, or a single encoder with multiple decoders for multi-class semantic segmentation, as shown in Fig 2. In both cases, the loss function is a weighted pixel-wise cross-entropy loss.
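To make the loss concrete, here is a minimal NumPy sketch of a weighted pixel-wise cross-entropy. The function name, the class weights and the choice to average over pixels (rather than normalize by the sum of weights) are my assumptions for illustration, not the exact formulation used in the papers cited here.

```python
import numpy as np

def weighted_pixel_cross_entropy(probs, labels, class_weights, eps=1e-7):
    """Mean weighted cross-entropy over all pixels (illustrative sketch).

    probs:         (H, W, C) per-pixel class probabilities (e.g. softmax output)
    labels:        (H, W) integer class index of each pixel
    class_weights: (C,) weight per class; values here are hypothetical
    """
    probs = np.clip(probs, eps, 1.0)           # avoid log(0)
    rows, cols = np.indices(labels.shape)
    p_true = probs[rows, cols, labels]         # probability given to the true class
    weights = np.asarray(class_weights)[labels]  # weight of each pixel's true class
    return float(np.mean(-weights * np.log(p_true)))

# Toy example: 2x2 image, 2 classes, the rarer class 1 weighted 3x more.
probs = np.full((2, 2, 2), 0.5)
labels = np.array([[0, 1],
                   [0, 0]])
loss = weighted_pixel_cross_entropy(probs, labels, class_weights=[1.0, 3.0])
```

Giving a larger weight to a rare class (for example, a small lesion against a large background) makes mistakes on those pixels cost more, which is the usual motivation for weighting.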

Fig 1: Serial connections between U-nets used as a cascaded U-net. Source: https://arxiv.org/pdf/1907.07677.pdf

Fig 2: Single encoder, multiple decoder branches for cascaded U-net. Source: https://sh-tsang.medium.com/review-coarse-to-fine-3d-u-net-multi-organ-segmentation-biomedical-image-segmentation-37a419fb963b

To understand the cascaded U-net, consider a situation where ONE U-net is NOT enough to perform fine-grained segmentation in specialized images such as medical images. The first U-net selects a rough neighborhood around the pathology, and this image is then fed as input to the next U-net, which detects the region of interest accurately, as shown in Fig 3 below.

Fig 3: Source: https://sh-tsang.medium.com/review-coarse-to-fine-3d-u-net-multi-organ-segmentation-biomedical-image-segmentation-37a419fb963b.

The left image is the output of U-net 1, where all regions within the red boundary are detected. This image is then fed to U-net 2, which further segments the pathological regions (blue and green) accurately.
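The coarse-to-fine idea can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: the two "models" are hypothetical threshold functions standing in for trained U-nets, and passing a cropped bounding box (with a margin) to the second stage is one possible way to feed the first output forward, not necessarily what the cited papers do.

```python
import numpy as np

def bounding_box(mask, margin=1):
    """Bounding box (r0, r1, c0, c1) around nonzero pixels, padded by a margin."""
    rows, cols = np.nonzero(mask)
    r0 = max(rows.min() - margin, 0)
    r1 = min(rows.max() + margin + 1, mask.shape[0])
    c0 = max(cols.min() - margin, 0)
    c1 = min(cols.max() + margin + 1, mask.shape[1])
    return r0, r1, c0, c1

def cascade_segment(image, coarse_model, fine_model, margin=1):
    """Stage 1 finds a rough region; stage 2 segments only inside that region."""
    coarse_mask = coarse_model(image)
    full_mask = np.zeros(image.shape, dtype=np.uint8)
    if not coarse_mask.any():
        return full_mask                      # nothing found by the first stage
    r0, r1, c0, c1 = bounding_box(coarse_mask, margin)
    full_mask[r0:r1, c0:c1] = fine_model(image[r0:r1, c0:c1])
    return full_mask

# Hypothetical stand-ins for two trained U-nets (simple thresholds).
coarse = lambda img: (img > 0.3).astype(np.uint8)
fine = lambda img: (img > 0.7).astype(np.uint8)

image = np.zeros((8, 8))
image[2:6, 2:6] = 0.5    # rough neighborhood of the pathology
image[3:5, 3:5] = 0.9    # the actual region of interest
result = cascade_segment(image, coarse, fine)
```

In a real pipeline, `coarse` and `fine` would be replaced by calls to the two trained networks, and the second network would typically be trained on crops produced by the first.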

Q2. [March 24, 2021] [Video: Unet for Multiclass Segmentation]

Explain the impact of using a deep CNN like VGG or ResNet as the backbone for U-net for either binary or multi-class segmentation.

Ans: U-net is one of the most popular encoder-decoder combinations, wherein the input is an image and the output is also an image. The encoder (left arm of the U-net below) extracts local and global features, while the decoder (right arm of the U-net) upsamples the features back to the image space. So the key concept for U-net is that the input and output have similar dimensions (they are in the same domain). Typically the U-net is a relatively shallow CNN model with 4 encoder-decoder levels and skip connections between each encoder-decoder level. This allows for short- and long-range skip connections. However, it is entirely possible to use deeper CNN models to encode features, followed by decoder layers to bring the features back to the image plane. For example, see the paper "Depthwise Separable Convolutional Neural Network Model for Intra-Retinal Cyst Segmentation", where the goal is to perform binary segmentation of cysts in optical coherence tomography (OCT) images using depthwise separable kernels that reduce computational complexity.
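To see why the encoder and decoder levels pair up through skip connections, it can help to track feature-map shapes through a 4-level U-net. The sketch below is plain shape arithmetic, not a trainable model; the base filter count of 64, the doubling of filters per level and "same" padding are my assumptions for illustration.

```python
def unet_shapes(height, width, levels=4, base_filters=64):
    """Return (encoder, bottleneck, decoder) feature-map shapes as (H, W, C)."""
    if height % 2**levels or width % 2**levels:
        raise ValueError("height and width must be divisible by 2**levels")
    encoder = []
    h, w, c = height, width, base_filters
    for _ in range(levels):
        encoder.append((h, w, c))           # features kept for the skip connection
        h, w, c = h // 2, w // 2, c * 2     # pooling halves size; filters double
    bottleneck = (h, w, c)
    decoder = []
    for _ in range(levels):
        h, w, c = h * 2, w * 2, c // 2      # upsampling doubles size; filters halve
        decoder.append((h, w, c))
    return encoder, bottleneck, decoder

encoder, bottleneck, decoder = unet_shapes(256, 256)
# Each decoder level has the same shape as its mirrored encoder level,
# which is what allows the skip connection to concatenate the two.
pairs = list(zip(reversed(encoder), decoder))
```

When a deeper backbone such as VGG or ResNet is used as the encoder, the same bookkeeping applies: the decoder taps the backbone at the layers where the spatial size halves, so that each upsampled feature map has a matching encoder feature map to connect to.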

For other deeper encoder-decoder combinations for segmentation, visit the GitHub repository (https://github.com/qubvel/segmentation_models), which provides code and examples for applying Unet, PSPnet, Linknet, FPN, etc. to semantic segmentation. For multi-class segmentation, the only change is the way the labels (Y) are prepared, with one binary map per class per image plane, as explained in the video.
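As a small illustration of that label preparation, the NumPy sketch below turns an integer label image into one binary map per class. The function name and the toy 2x2 label image are hypothetical; the video may prepare the labels differently.

```python
import numpy as np

def to_binary_maps(label_image, num_classes):
    """Convert an (H, W) integer label image to (H, W, num_classes) binary maps."""
    classes = np.arange(num_classes)
    # Broadcasting compares every pixel against every class index.
    return (label_image[..., None] == classes).astype(np.uint8)

labels = np.array([[0, 1],
                   [2, 1]])
y = to_binary_maps(labels, num_classes=3)   # y[..., k] is the binary map for class k
```

Each pixel is set to 1 in exactly one of the maps, so taking the argmax over the last axis recovers the original label image.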

Thanks and stay tuned!

Q1. [March 23, 2021] [Video: Unet for Multiclass Segmentation]

Kindly discuss the recent trends, repos, skills expected and impact of AI in biomedical engineering/applications for Beginners.

Ans: I have now created a Data and Research Samples page that houses freely available public domain data sets. That should help beginners get started.

For skills, data modeling should be envisioned as a 3-step process: 1) data processing, 2) model and parameters, 3) metrics and feedback. For any problem, these three aspects should be taken care of in order of importance.

For beginners, my advice is always to start with Kaggle data sets. The major reason is that challenges such as https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge have code submitted by others: https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/code

So starting with the benchmark code can help you build confidence, understand recent trends and connect with the ML/AI community. The biomedical field has a plethora of problems in diagnostic medicine, prognostics and personalized medicine that still need to be solved.

Thanks and stay tuned!