Research 

Keyframe Selection from Colonoscopy Videos for Enhanced Polyp Detection

The acquisition of colonoscopy videos has increased tremendously for retrospective analysis, comprehensive inspection, and the detection of polyps to diagnose colorectal cancer (CRC). However, extracting meaningful clinical information from colonoscopy videos requires an enormous amount of review time, which places a considerable burden on surgeons. To reduce this manual effort, we propose the first end-to-end automated multi-stage deep learning framework to extract an adequate number of clinically significant frames, i.e., keyframes, from colonoscopy videos.

The proposed framework comprises multiple stages that employ different deep learning models to select keyframes, i.e., high-quality, non-redundant polyp frames capturing multiple views of polyps. In one of these stages, we also propose a novel multi-scale attention-based model, YcOLOn, for polyp localization, which generates the ROIs and prediction scores crucial for obtaining keyframes. We further design a GUI application to navigate through the different stages.

Extensive evaluation in real-world scenarios involving patient-wise and cross-dataset validations shows the efficacy of the proposed approach. The framework removes 96.3% and 94.02% of the frames, reduces detection processing time by 38.28% and 59.99%, and increases mAP by 2% and 5% on the SUN database and CVC-VideoClinicDB, respectively. The source code is available at https://github.com/Vanshali/KeyframeExtraction.
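The multi-stage selection described above can be sketched as a simple filtering pipeline. This is a minimal illustration only: the `quality_fn`, `similarity_fn`, and `polyp_score_fn` stand-ins are hypothetical placeholders for the framework's dedicated deep models (in the paper, for instance, polyp prediction scores come from YcOLOn).

```python
# Minimal sketch of multi-stage keyframe selection: quality filtering,
# polyp-confidence filtering, and redundancy removal. The scoring
# functions are stand-in heuristics, not the actual deep models.

def select_keyframes(frames, quality_fn, similarity_fn, polyp_score_fn,
                     q_thresh=0.5, sim_thresh=0.9, score_thresh=0.5):
    keyframes = []
    for frame in frames:
        if quality_fn(frame) < q_thresh:          # stage 1: drop low-quality frames
            continue
        if polyp_score_fn(frame) < score_thresh:  # stage 2: keep likely polyp frames
            continue
        # stage 3: drop near-duplicates of already-selected keyframes
        if any(similarity_fn(frame, k) > sim_thresh for k in keyframes):
            continue
        keyframes.append(frame)
    return keyframes

# toy usage with frames as (quality, polyp_score, feature) tuples
frames = [(0.9, 0.8, 1.0), (0.2, 0.9, 1.1), (0.9, 0.1, 2.0),
          (0.9, 0.85, 1.05), (0.8, 0.9, 3.0)]
kept = select_keyframes(
    frames,
    quality_fn=lambda f: f[0],
    polyp_score_fn=lambda f: f[1],
    similarity_fn=lambda a, b: 1.0 - min(1.0, abs(a[2] - b[2])),
)
print(len(kept))  # -> 2 (frames 1 and 5 survive all three stages)
```

Each stage independently prunes the frame stream, which is why the framework can discard the vast majority of frames before any expensive downstream processing.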

Link to paper: https://ieeexplore.ieee.org/abstract/document/10268934 


GastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided Gastrointestinal Disease Detection

Integrating real-time artificial intelligence (AI) systems into clinical practice faces challenges of scalability and acceptance, including data availability, biased outcomes, poor data quality, lack of transparency, and underperformance on unseen datasets from different distributions. The scarcity of large-scale, precisely labeled, and diverse datasets is the major obstacle to clinical integration, stemming in part from legal restrictions and the extensive manual effort required for accurate annotation by clinicians. To address these challenges, we present GastroVision, a multi-center open-access gastrointestinal (GI) endoscopy dataset that includes different anatomical landmarks, pathological abnormalities, polyp removal cases, and normal findings (a total of 27 classes) from the GI tract. The dataset comprises 8,000 images acquired from Baerum Hospital in Norway and Karolinska University Hospital in Sweden and was annotated and verified by experienced GI endoscopists. Furthermore, we validate the significance of our dataset with extensive benchmarking based on popular deep learning baseline models. We believe our dataset can facilitate the development of AI-based algorithms for GI disease detection and classification. Our dataset is available at https://osf.io/84e7f/

Link to paper: https://link.springer.com/chapter/10.1007/978-3-031-47679-2_10 

arXiv: https://arxiv.org/abs/2307.08140 


Can Adversarial Networks Make Uninformative Colonoscopy Video Frames Clinically Informative?

AAAI_Poster_VanshaliSharma.pdf

A DWT-based Encoder-Decoder Network for Specularity Segmentation in Colonoscopy Images

Specularity segmentation in colonoscopy images is a crucial pre-processing step for efficient computational diagnosis, as specular highlights can mislead detectors aimed at the precise identification of biomarkers. Conventional methods adopted so far do not provide satisfactory results, especially in overexposed regions, and the potential of deep learning methods remains unexplored in this problem domain. Our work aims to provide more accurate highlight segmentation to assist surgeons. In this paper, we propose a novel deep learning-based approach that performs segmentation through multi-resolution analysis, achieved by introducing the discrete wavelet transform (DWT) into the proposed model. We replace the standard pooling layers with DWTs, which helps preserve information and circumvents the effect of overexposed regions. All experiments are performed on a publicly available benchmark dataset, where an F1-score of 83.10% is obtained on the test set. The experimental results show that this technique outperforms state-of-the-art methods and performs significantly better in overexposed regions. The proposed model also outperformed several deep learning models (originally applied in different domains) when tested under our problem specifications. Our method provides segmentation outcomes that are closer to the actual segmentation done by experts, ensuring better pre-processed colonoscopy images that aid in the diagnosis of colorectal cancer.
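The pooling-replacement idea described above can be illustrated with a single-level 2D Haar DWT. This is a sketch of the general technique, not the paper's exact model: like a pooling layer, each subband halves the spatial resolution, but the four subbands together retain all of the input's information, since the transform is invertible.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT used as a pooling substitute.
    x: 2D array with even height/width -> (LL, LH, HL, HH) subbands,
    each at half the spatial resolution."""
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # approximation (smooth) subband
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform: perfect reconstruction of the original map,
    which is exactly what max pooling cannot offer."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

x = np.arange(16, dtype=float).reshape(4, 4)
print(np.allclose(haar_idwt2(*haar_dwt2(x)), x))  # -> True
```

Inside an encoder, the four subbands would typically be stacked along the channel axis in place of a pooled feature map, so downsampling discards nothing, including detail in overexposed regions.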

Link to paper: https://link.springer.com/article/10.1007/s11042-023-14564-1 


Lymph node detection in CT scans using modified U-Net with residual learning and 3D deep network

Lymph node (LN) detection is a crucial step that complements the diagnosis and treatment involved in cancer investigations. However, the low-contrast structures in CT scan images and the nodes' varied shapes, sizes, and poses, along with their sparsely distributed locations, make detection challenging and lead to many false positives. To overcome these issues, our work provides an automated framework for LN detection that achieves more accurate results with fewer false positives. The proposed work consists of two stages: candidate generation and false positive reduction. The first stage generates volumes of interest (VOIs) of probable LN candidates using a modified U-Net with a ResNet architecture, achieving high sensitivity at the cost of increased false positives. The second stage processes the candidate LNs with a 3D convolutional neural network (CNN) classifier to reduce false positives. Our proposed approach yields sensitivities of 87% at 2.75 false positives per volume (FP/vol.) and 79% at 1.74 FP/vol. on the mediastinal and abdominal datasets, respectively.
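The two-stage pipeline can be sketched as follows, with both networks replaced by stand-ins: `prob_volume` plays the role of the modified U-Net's sensitivity-oriented output, and a toy intensity-based `classifier` stands in for the 3D CNN used for false positive reduction.

```python
import numpy as np

def generate_candidates(prob_volume, thresh=0.5):
    """Stage 1: keep voxel locations with high candidate probability
    (biased toward sensitivity, so false positives are expected)."""
    return [tuple(int(v) for v in idx) for idx in np.argwhere(prob_volume > thresh)]

def extract_voi(volume, center, size=1):
    """Cut a (2*size+1)^3 volume of interest around a candidate,
    zero-padding at the scan borders."""
    padded = np.pad(volume, size)
    z, y, x = (c + size for c in center)
    return padded[z - size:z + size + 1,
                  y - size:y + size + 1,
                  x - size:x + size + 1]

def reduce_false_positives(volume, candidates, classifier, thresh=0.5):
    """Stage 2: score each candidate VOI and keep confident detections."""
    return [c for c in candidates if classifier(extract_voi(volume, c)) >= thresh]

# toy scan: a bright 3x3x3 blob imitating a lymph node
scan = np.zeros((6, 6, 6))
scan[1:4, 1:4, 1:4] = 1.0

# stage-1 stand-in output: one true hit, one spurious hit
prob_volume = np.zeros_like(scan)
prob_volume[2, 2, 2] = 0.9   # centered on the blob
prob_volume[4, 4, 4] = 0.8   # far from the blob

candidates = generate_candidates(prob_volume)
detections = reduce_false_positives(scan, candidates,
                                    classifier=lambda voi: voi.mean())
print(detections)  # -> [(2, 2, 2)]: the spurious candidate is rejected
```

The division of labor mirrors the abstract: a permissive first stage that must not miss nodes, followed by a stricter per-candidate classifier that trims the resulting false positives.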

Link to paper: https://link.springer.com/article/10.1007/s11548-022-02822-w 

TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-center Out-of-Distribution Testing

TransNetR_Poster.pdf