KiU-Net

Towards Accurate Segmentation of Biomedical Images using Over-complete Representations

MICCAI 2020

Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

Johns Hopkins University, Rutgers University

[Conference Paper] [Code] [Journal Extension] [Slides]

Abstract

Most methods for medical image segmentation use U-Net or its variants, as they have been successful in most applications. After a detailed analysis of these “traditional” encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This is in spite of the fact that these approaches propagate low-level features to the output through skip connections. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high-level features causes U-Net based approaches to learn less information about low-level features, which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture in which we project the input image into a higher dimension such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network, Kite-Net, which learns to capture fine details and accurate edges of the input, and (2) U-Net, which learns high-level features. We also propose an extension, KiU-Net 3D, which can be used for 3D volumetric segmentation.

Architecture

Figure: KiU-Net

Figure: KiU-Net 3D

3D Segmentation Comparison

Figure panels: U-Net 3D, KiU-Net 3D, Ground Truth

Idea: Constraining Receptive Field by Overcomplete Representations

In a generic "encoder-decoder" architecture, the initial few blocks of the encoder learn low-level features of the data while the later blocks learn high-level features. Eventually, the encoder learns to map the data to a lower dimensionality (in the spatial sense). The increasing receptive field size over the depth of the network constrains the network to focus more on the higher-level features. We propose using overcomplete representations, in which we constrain the receptive field from increasing. This is done by a simple change in the encoder architecture: max-pooling is replaced by up-sampling. This helps the filters in the deep layers focus more on low-level details, which helps in fine segmentation.
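The effect of this change on the receptive field can be checked with a little arithmetic. The sketch below is not taken from the paper: it uses the standard receptive-field recursion for stacked layers and assumes 3x3 convolutions, 2x2 max-pooling with stride 2, and nearest-neighbour 2x up-sampling (modelled here as a kernel of 1 with stride 0.5). The function name `receptive_field` and the three-block depth are illustrative choices.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) after each layer.

    Each layer is a (kernel_size, stride) pair. A stride below 1 models
    up-sampling, which shrinks the step between neighbouring features.
    """
    rf, jump = 1.0, 1.0  # receptive field and feature spacing, in input pixels
    history = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # a kernel widens the field by (k - 1) steps
        jump *= stride             # pooling enlarges the step, up-sampling shrinks it
        history.append(rf)
    return history


# Assumed layer settings (illustrative, not read from the paper's code).
CONV3 = (3, 1)         # 3x3 convolution, stride 1
MAXPOOL2 = (2, 2)      # 2x2 max-pooling, stride 2
UPSAMPLE2 = (1, 0.5)   # nearest-neighbour 2x up-sampling

undercomplete = [CONV3, MAXPOOL2] * 3   # U-Net style encoder blocks
overcomplete = [CONV3, UPSAMPLE2] * 3   # Kite-Net style encoder blocks

under_rf = receptive_field(undercomplete)
over_rf = receptive_field(overcomplete)

print("conv + max-pool :", under_rf)
print("conv + up-sample:", over_rf)
```

Under these assumptions, three conv + max-pool blocks reach a receptive field of 22 input pixels, while three conv + up-sample blocks stay at 4.5 input pixels, so the deep filters of the overcomplete encoder keep looking at small neighbourhoods.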

Figure panels: Undercomplete Representations (e.g., U-Net), with increasing receptive field; Overcomplete Representations (Kite-Net), with constrained receptive field
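As a rough illustration of what "undercomplete" and "overcomplete" mean in the spatial sense, the sketch below traces the side length of the feature maps through an encoder that halves the resolution at every block (max-pooling) and through one that doubles it (up-sampling). The 128-pixel input, the depth of 3, and the helper name `encoder_sizes` are assumptions made for this example, not values from the paper.

```python
def encoder_sizes(input_size, scale, depth):
    """Return the feature-map side length at the input and after each block."""
    sizes = [input_size]
    for _ in range(depth):
        sizes.append(int(sizes[-1] * scale))
    return sizes


# Hypothetical 128x128 input and three encoder blocks.
unet_sizes = encoder_sizes(128, 0.5, 3)  # max-pooling halves each side
kite_sizes = encoder_sizes(128, 2, 3)    # up-sampling doubles each side

print("undercomplete (U-Net style):   ", unet_sizes)
print("overcomplete (Kite-Net style): ", kite_sizes)
```

The undercomplete encoder ends at a 16x16 map, while the overcomplete one ends at 1024x1024, which is why the overcomplete branch is spatially larger than its input and considerably more memory-hungry per channel.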

Feature Map Visualization

Figure panels: U-Net, Ki-Net

Sample Qualitative Results

Some qualitative results on ultrasound brain ventricle segmentation of preterm neonates, the GLAnd Segmentation (GLAS) dataset, and RITE (Retinal Images vessel Tree Extraction). Please refer to the main paper for quantitative results.

Citation:

@inproceedings{valanarasu2020kiu,
  title={KiU-Net: Towards Accurate Segmentation of Biomedical Images Using Over-Complete Representations},
  author={Valanarasu, Jeya Maria Jose and Sindagi, Vishwanath A and Hacihaliloglu, Ilker and Patel, Vishal M},
  booktitle={Medical Image Computing and Computer Assisted Intervention--MICCAI 2020: 23rd International Conference, Lima, Peru, October 4--8, 2020, Proceedings, Part IV 23},
  pages={363--373},
  year={2020},
  organization={Springer}
}

@article{valanarasu2020kiu,
  title={KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation},
  author={Valanarasu, Jeya Maria Jose and Sindagi, Vishwanath A and Hacihaliloglu, Ilker and Patel, Vishal M},
  journal={arXiv preprint arXiv:2010.01663},
  year={2020}
}