KiU-Net
Towards Accurate Segmentation of Biomedical Images using Over-complete Representations
MICCAI 2020
Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel
Johns Hopkins University, Rutgers University
Abstract
Most methods for medical image segmentation use U-Net or its variants, as they have been successful in most applications. After a detailed analysis of these “traditional” encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This is in spite of the fact that these approaches propagate low-level features to the output through skip connections. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high-level features causes U-Net based approaches to learn less information about low-level features, which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture in which we project the input image into a higher dimension, such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network, Kite-Net, which learns to capture fine details and accurate edges of the input, and (2) U-Net, which learns high-level features. We also propose its extension, KiU-Net 3D, which can be used for 3D volumetric segmentation.
Architecture
Figures: KiU-Net architecture; KiU-Net 3D architecture.
3D Segmentation Comparison
Figure panels: U-Net 3D, KiU-Net 3D, Ground Truth.
Idea: Constraining Receptive Field by Overcomplete Representations
In a generic encoder-decoder architecture, the initial few blocks of the encoder learn low-level features of the data, while the later blocks learn high-level features. Eventually, the encoder learns to map the data to a lower dimensionality (in the spatial sense). The receptive field size, which increases with the depth of the network, constrains the network to focus more on higher-level features. We propose using overcomplete representations, in which we constrain the receptive field from increasing. This is done by a simple change in the architecture of the encoder: max-pooling is replaced by up-sampling. This helps the filters in the deep layers focus more on low-level details, which helps in fine segmentation.
Undercomplete Representations (e.g., U-Net): Increasing Receptive Field
Overcomplete Representations (Kite-Net): Constrained Receptive Field
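The receptive-field argument in this section can be checked with a small calculation. The sketch below is not taken from the paper or its code: it applies the standard receptive-field recurrence (size grows by `(kernel - 1) * jump` per layer, and `jump` is multiplied by the stride) to two hypothetical three-convolution encoders. Modelling up-sampling as a kernel-1 layer with stride 1/2 (nearest-neighbour interpolation) is an assumption made for illustration, as are the names `receptive_field`, `CONV`, `POOL`, and `UPSAMPLE`.

```python
from fractions import Fraction


def receptive_field(layers):
    """Return the receptive field (input pixels, one axis) after each layer.

    Each layer is a (kernel_size, stride) pair. A stride of 1/2 is an
    assumed model of nearest-neighbour up-sampling by a factor of 2.
    """
    rf = Fraction(1)    # a single input pixel sees only itself
    jump = Fraction(1)  # spacing, in input pixels, between adjacent positions
    sizes = []
    for kernel, stride in layers:
        rf = rf + (kernel - 1) * jump  # the kernel widens the field of view
        jump = jump * stride           # resampling changes the spacing
        sizes.append(rf)
    return sizes


# Hypothetical layer definitions, used only for this illustration.
CONV = (3, 1)                   # 3x3 convolution, stride 1
POOL = (2, 2)                   # 2x2 max-pooling, stride 2
UPSAMPLE = (1, Fraction(1, 2))  # up-sampling by 2 (assumed nearest-neighbour)

undercomplete = [CONV, POOL, CONV, POOL, CONV]          # U-Net style encoder
overcomplete = [CONV, UPSAMPLE, CONV, UPSAMPLE, CONV]   # Kite-Net style encoder
no_resampling = [CONV, CONV, CONV]                      # baseline for comparison

if __name__ == "__main__":
    for name, layers in [
        ("undercomplete", undercomplete),
        ("no resampling", no_resampling),
        ("overcomplete", overcomplete),
    ]:
        print(name, float(receptive_field(layers)[-1]))
```

Under these assumptions, the third convolution sees 18 input pixels per axis in the max-pooling encoder, 7 with no resampling, and 4.5 in the up-sampling encoder. The fractional value is an idealization of the model, but it shows the direction of the effect: replacing max-pooling with up-sampling keeps the deep filters looking at a small neighbourhood of the input.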
Feature Map Visualization
Figure panels: U-Net, Kite-Net.
Sample Qualitative Results
Some qualitative results on Ultrasound Brain Ventricle Segmentation of Preterm Neonates, the GLAnd Segmentation (GLAS) dataset, and RITE (Retinal Images vessel Tree Extraction). Please refer to the main paper for quantitative results.
Citation:
@inproceedings{valanarasu2020kiu,
title={KiU-Net: Towards Accurate Segmentation of Biomedical Images Using Over-Complete Representations},
author={Valanarasu, Jeya Maria Jose and Sindagi, Vishwanath A and Hacihaliloglu, Ilker and Patel, Vishal M},
booktitle={Medical Image Computing and Computer Assisted Intervention--MICCAI 2020: 23rd International Conference, Lima, Peru, October 4--8, 2020, Proceedings, Part IV 23},
pages={363--373},
year={2020},
organization={Springer}
}
@article{valanarasu2020kiu,
title={KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation},
author={Valanarasu, Jeya Maria Jose and Sindagi, Vishwanath A and Hacihaliloglu, Ilker and Patel, Vishal M},
journal={arXiv preprint arXiv:2010.01663},
year={2020}
}