Deep learning (DL) has grown rapidly in the field of medical image analysis and now plays a vital role in clinical areas that demand the utmost human specialization, such as pathology [1], radiology [2], and other medical fields [3]. In medical image analysis, the limited number of available images has always been a major concern for researchers, since recent DL techniques depend on large datasets of annotated images; learning from a limited set of annotated images therefore remains a challenging task. Under these circumstances, transfer learning is considered an effective technique: the model is first pre-trained on a source dataset [4] and then fine-tuned on the target dataset. This approach has proven effective when only a small annotated dataset is available [5]. However, most pre-trained models are trained on natural images, i.e., the ImageNet dataset, and do not perform well on medical imaging data. Medical images differ from natural images in many respects, including image type, information content, size, color, and dimensionality. According to the study in [6], models pre-trained on ImageNet transfer poorly to medical imaging tasks, so an accurate and novel transfer learning technique is needed to improve performance.

Medical imaging datasets are not as readily available as datasets in other domains because of patient privacy concerns, and one of the major issues in this domain is the scarcity of labeled data. Nevertheless, a huge number of unlabeled images from different medical imaging domains, such as skin cancer, brain tumor, breast cancer, and COVID-19 datasets, are available for research purposes. In this research work, a novel transfer learning technique will be investigated in which the proposed model is first pre-trained on a large amount of unannotated data, after which the weights of the pre-trained model are used for fine-tuning on the target domain dataset.
The ISIC challenge datasets from 2016 to 2020 can be used collectively for the self-supervised pre-training of the model. The 2016 ISIC challenge dataset contains 4,314 images, the dataset of the following ISIC challenge contains almost 6,000 images, and the combined 2016-2020 ISIC challenge datasets comprise almost 88,314 images, a pool large enough for self-supervised pre-training. Further images can be collected from skin cancer, brain tumor, and breast cancer repositories.
To address the problem of data scarcity, this research aims to use a self-supervised learning approach to train the model on unlabeled data. Self-supervised learning is a new paradigm in medical imaging in which the model is pre-trained on a large number of unannotated images and afterwards fine-tuned on the target domain dataset. The general flow of the proposed method is shown in Figure 1 below.
Figure 1: General Flow of the proposed methodology
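In code, this two-stage flow amounts to pre-training an encoder on unlabeled data, saving its weights, and loading them into the downstream model before fine-tuning. The following is a minimal PyTorch sketch under that assumption; the tiny encoder, segmentation head, and file name are illustrative placeholders, not the final architecture:

```python
# Minimal sketch of the proposed two-stage flow: self-supervised
# pre-training of an encoder, then weight transfer for fine-tuning.
import torch
import torch.nn as nn

def make_encoder():
    # Shared backbone used in both stages (placeholder architecture).
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    )

# Stage 1: pre-train the encoder with a pretext task (e.g. CPC or RPL)
# on the unlabeled pool, then keep only the encoder weights.
encoder = make_encoder()
# ... self-supervised training loop over the unlabeled images ...
torch.save(encoder.state_dict(), "ssl_encoder.pt")

# Stage 2: build the segmentation model and initialize its encoder
# from the pre-trained weights instead of training from scratch.
seg_encoder = make_encoder()
seg_encoder.load_state_dict(torch.load("ssl_encoder.pt"))
seg_head = nn.Conv2d(64, 1, 1)  # 1-channel tumor-mask logits
seg_model = nn.Sequential(seg_encoder, seg_head)
# ... supervised fine-tuning loop over the annotated target data ...
```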
Pre-training of the model will be carried out using a self-supervised learning approach. Since the medical domain is supervision-starved, self-supervised learning is a promising way to exploit unannotated images and improve the efficiency and performance of a deep learning model. In this approach, supervisory signals are created from the data itself, without labels, and yield effective representations for downstream tasks. The model can thus be pre-trained on a large dataset without human-annotated labels: the convolutional network is trained on automatically generated targets obtained by exploiting correlations between different parts of the input. In the pre-training phase, unlabeled data is fed to the model, which learns generic representations of the data for the downstream task of brain tumor segmentation. After pre-training, the weights of the pre-trained model are used as the initial weights of the proposed model; instead of training the model from scratch, the pre-trained weights are transferred, following the transfer learning paradigm.

Moreover, self-supervised techniques such as Contrastive Predictive Coding (CPC) and Relative Patch Location (RPL) can be explored and adapted to the medical context. In CPC, an encoder maps each patch of the input image to a latent representation; the latent vectors are then summarized into context vectors, which are used to predict the latent representations of subsequent patches. This prediction is cast as an N-way classification problem. RPL instead exploits spatial context in images as a source of semantic supervision: non-overlapping patches are sampled from each input image at a random location, the patch at the center of the grid serves as the reference, and a query patch is selected from its surroundings. The position of the query patch relative to the reference patch serves as the positive label, while all other positions act as negatives, casting the prediction as an (N-1)-way classification problem. Other self-supervised learning techniques can also be explored during the research work.
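As a concrete illustration, the following is a minimal PyTorch sketch of the RPL pretext task described above, assuming a 3x3 patch grid (N = 9 patches, hence an 8-way classification); the small encoder, patch size, and helper names are illustrative placeholders rather than the final model:

```python
# Minimal sketch of the Relative Patch Location (RPL) pretext task.
import torch
import torch.nn as nn

class RPLModel(nn.Module):
    def __init__(self, feat_dim=128, num_positions=8):
        super().__init__()
        # Small CNN encoder shared by the reference and query patches.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Classifier over the 8 possible relative locations.
        self.head = nn.Linear(2 * feat_dim, num_positions)

    def forward(self, ref_patch, query_patch):
        z_ref = self.encoder(ref_patch)
        z_query = self.encoder(query_patch)
        return self.head(torch.cat([z_ref, z_query], dim=1))

def sample_rpl_pair(image, patch=64):
    """Cut a 3x3 grid of non-overlapping patches; return the center
    patch as reference, a random neighbor as query, and the neighbor's
    position index (0..7) as the self-generated label."""
    grid = [image[:, r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            for r in range(3) for c in range(3)]
    ref = grid.pop(4)  # center of the grid is the reference patch
    label = torch.randint(0, 8, (1,)).item()
    return ref, grid[label], label

# Usage: pre-train on unlabeled images with a cross-entropy loss.
model = RPLModel()
image = torch.rand(3, 192, 192)  # dummy unlabeled image
ref, query, label = sample_rpl_pair(image)
logits = model(ref.unsqueeze(0), query.unsqueeze(0))
loss = nn.functional.cross_entropy(logits, torch.tensor([label]))
loss.backward()
```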
Overcoming the Computational Resource Requirements
Because self-supervision works best on large amounts of data, high computational power is required. Moreover, learning from unlabeled data is slower than learning from labeled data, since the supervisory labels must first be generated from the data itself, an additional task with its own computational cost. The resources of the institute, or online services such as Google Colab Pro, can be used to meet these requirements.
Evaluation Measures
The following measures will be used to evaluate the proposed method; a minimal sketch of how they can be computed is given after the list.
1. Dice Score
2. Accuracy
3. F-Measure
4. Precision
5. Sensitivity
6. Specificity
7. False Negative Rate
8. False Positive Rate
9. Area under the curve (AUC)
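To make these measures concrete, the following is a minimal sketch of how they can be computed from binary segmentation masks, assuming NumPy arrays and scikit-learn's roc_auc_score for the AUC; the function name and the epsilon guard are illustrative choices:

```python
# Minimal sketch of the listed evaluation measures for binary masks.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate(pred, target, scores=None):
    """pred, target: binary (0/1) masks; scores: probabilities for AUC."""
    tp = np.sum((pred == 1) & (target == 1))
    tn = np.sum((pred == 0) & (target == 0))
    fp = np.sum((pred == 1) & (target == 0))
    fn = np.sum((pred == 0) & (target == 1))
    eps = 1e-8  # guard against division by zero
    precision = tp / (tp + fp + eps)
    sensitivity = tp / (tp + fn + eps)  # recall / true positive rate
    specificity = tn / (tn + fp + eps)
    metrics = {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "accuracy": (tp + tn) / (tp + tn + fp + fn + eps),
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity + eps),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": fn / (fn + tp + eps),
        "false_positive_rate": fp / (fp + tn + eps),
    }
    if scores is not None:
        metrics["auc"] = roc_auc_score(target.ravel(), scores.ravel())
    return metrics
```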
[1] R. Valieris et al., “Deep Learning Predicts Underlying Features on Pathology Images with Therapeutic Relevance for Breast and Gastric Cancer,” Cancers, vol. 12, p. 3687, 2020, doi: 10.3390/cancers12123687.
[2] R. Hamamoto et al., “Application of Artificial Intelligence Technology in Oncology: Towards the Establishment of Precision Medicine,” Cancers, vol. 12, p. 3532, 2020, doi: 10.3390/cancers12123532.
[3] T. Nazir et al., “Retinal Image Analysis for Diabetes-Based Eye Disease Detection Using Deep Learning,” Appl. Sci., vol. 10, p. 6185, 2020, doi: 10.3390/app10186185.
[4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2009, pp. 248–255, doi: 10.1109/cvprw.2009.5206848.
[5] L. Alzubaidi et al., “Novel transfer learning approach for medical imaging with limited labeled data,” Cancers, vol. 13, no. 7, pp. 1–22, 2021, doi: 10.3390/cancers13071590.
[6] L. Alzubaidi et al., “Towards a better understanding of transfer learning for medical imaging: A case study,” Appl. Sci., vol. 10, no. 13, pp. 1–21, 2020, doi: 10.3390/app10134523.