My research interests include Computer Vision, Machine/Deep Learning, Medical Imaging, and Image/Video Processing. I work on problems in healthcare, visual surveillance, traffic monitoring, and automatic vehicle navigation, among others.
The following are highlights of my current and previous research:
Medical Imaging:
The Unet has become the gold-standard method for segmentation of 2D medical images, against which any new method must be validated. In recent years, a number of variations on the seminal Unet have been proposed, with promising results in the papers introducing them. However, there is no clear consensus on whether any of these architectures generalize as well, and the Unet remains the methodological gold standard. The purpose of this study was to evaluate some of the most promising Unet-inspired architectures for the task of 3D segmentation. For segmentation of 3D scans, Unet-inspired methods are also dominant, but there is a larger variety across applications. By evaluating the architectures in a different dimensionality, embedded in a different method, and for a different task, we aimed to determine whether any of these Unet alternatives is promising as a new gold standard that generalizes even better than the Unet. Specifically, we investigated the architectures as the central 2D segmentation core in the Multi-Planar Unet 3D segmentation method, which previously demonstrated excellent generalization in the MICCAI Segmentation Decathlon. A promising Unet variant that consistently outperformed the Unet in this quite different setting would strongly support a claim of generalizability. To this end, we evaluated four architectures for segmentation of cartilage in knee MRIs from three different cohorts. The implementation of our work is available at link.
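To illustrate how a 2D segmentation core can be used to segment a 3D scan, here is a minimal sketch in Python. It is not the Multi-Planar Unet implementation: the axis-aligned slicing, the simple averaging of the three views, and the names `threshold_segmenter`, `segment_along_axis`, and `multi_planar_segment` are all assumptions made for illustration, and the trained network is replaced by a trivial per-pixel threshold.

```python
from typing import Callable, List

Slice2D = List[List[float]]
Volume = List[List[List[float]]]  # indexed as vol[z][y][x]


def threshold_segmenter(image: Slice2D) -> Slice2D:
    """Hypothetical stand-in for a trained 2D network: per-pixel foreground score."""
    return [[1.0 if value > 0.5 else 0.0 for value in row] for row in image]


def _coords(axis: int, k: int, i: int, j: int) -> List[int]:
    """Map slice index k and in-plane indices (i, j) to a (z, y, x) position."""
    pos = [i, j]
    pos.insert(axis, k)
    return pos


def segment_along_axis(vol: Volume, axis: int,
                       seg2d: Callable[[Slice2D], Slice2D]) -> Volume:
    """Apply the 2D segmenter to every slice perpendicular to `axis`."""
    dims = (len(vol), len(vol[0]), len(vol[0][0]))
    rows, cols = dims[:axis] + dims[axis + 1:]
    out = [[[0.0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    for k in range(dims[axis]):
        # Extract one 2D plane from the volume.
        plane = []
        for i in range(rows):
            row = []
            for j in range(cols):
                z, y, x = _coords(axis, k, i, j)
                row.append(vol[z][y][x])
            plane.append(row)
        pred = seg2d(plane)
        # Write the 2D prediction back into the 3D score volume.
        for i in range(rows):
            for j in range(cols):
                z, y, x = _coords(axis, k, i, j)
                out[z][y][x] = pred[i][j]
    return out


def multi_planar_segment(vol: Volume,
                         seg2d: Callable[[Slice2D], Slice2D] = threshold_segmenter
                         ) -> List[List[List[int]]]:
    """Average the scores of the three axis-aligned views (an assumed fusion rule)."""
    views = [segment_along_axis(vol, axis, seg2d) for axis in range(3)]
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    return [[[int(sum(v[z][y][x] for v in views) / 3 >= 0.5)
              for x in range(nx)] for y in range(ny)] for z in range(nz)]
```

Because the 2D core is passed in as a function, swapping one architecture for another leaves the rest of the 3D pipeline unchanged, which is the property the comparison above relies on.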
Visual Object Tracking:
Developing a robust object tracker is challenging due to factors such as occlusion, motion blur, fast motion, illumination variation, rotation, background clutter, low resolution, and deformation across frames. Many good approaches based on sparse representation have been presented in the literature to tackle these problems. However, most of these algorithms do not focus on learning the sparse representation. They only model the target appearance and therefore drift away from the target when the training samples are imprecise. With these factors in mind, we proposed a visual object tracking algorithm that integrates a sparse representation based on a coarse-to-fine search strategy with the weighted multiple instance learning (WMIL) algorithm. Compared with other trackers, our approach retains more information about the original signal at lower complexity, owing to the coarse-to-fine search method, and also assigns weights to important samples. It can therefore readily discriminate background features from the foreground. Furthermore, we select samples from the un-occluded sub-regions to efficiently build a strong classifier. As a result, we obtain a stable and robust object tracker that handles the aforementioned challenges. Experimental results with quantitative and qualitative analysis on a challenging benchmark dataset demonstrate the accuracy and efficiency of our method.
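The idea behind a coarse-to-fine search can be sketched in a few lines of Python. This is a generic illustration, not the tracker described above: the function name `coarse_to_fine_search`, its parameters, and the `score` callable (a stand-in for whatever confidence a tracker assigns to a candidate window position) are assumptions.

```python
from typing import Callable, Tuple

Point = Tuple[int, int]


def coarse_to_fine_search(score: Callable[[Point], float],
                          center: Point,
                          radius: int = 16,
                          coarse_step: int = 4) -> Point:
    """Return the position near `center` with the highest score.

    The first pass samples a sparse grid over the whole search window.
    Each later pass halves the step and searches only around the best
    position found so far, so far fewer candidates are scored than in
    an exhaustive scan of the window.
    """
    best = center
    best_score = score(center)
    step, reach = coarse_step, radius
    while step >= 1:
        cx, cy = best  # grid for this pass is fixed around the current best
        for dx in range(-reach, reach + 1, step):
            for dy in range(-reach, reach + 1, step):
                candidate = (cx + dx, cy + dy)
                value = score(candidate)
                if value > best_score:
                    best, best_score = candidate, value
        reach = step   # shrink the window to one coarse cell
        step //= 2     # refine the sampling density
    return best
```

With the default settings this scores 132 candidates, compared with 1089 for an exhaustive scan of the same 33 x 33 window. The trade-off is that a sparse first pass can miss a narrow peak, so the coarse step has to be chosen with the expected smoothness of the score in mind.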
Object Detection:
Detecting moving objects in video frame sequences has many useful applications in computer vision. The proposed moving object detection method first estimates the bi-directional optical flow fields between (i) the current frame and the previous frame and (ii) the current frame and the next frame. Each optical flow field is then normalized and enhanced, and divided into non-overlapping blocks. The moving objects are finally detected as binary blobs by examining the histogram-based thresholded values of the optical flow field of each block as well as the optical flow field of the candidate flow value. Our technique has been conceptualized, implemented, and tested on real video datasets with complex background environments. The experimental results and quantitative evaluation show that our technique is more effective and efficient than existing methods.
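The block-wise part of such a pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the published method: the optical flow field is taken as given (a grid of `(u, v)` displacement pairs), normalization divides by the peak magnitude, and the histogram-based threshold is replaced by the mean block score. The function name `detect_moving_blocks` and its parameters are hypothetical.

```python
import math
from typing import List, Optional, Tuple

FlowField = List[List[Tuple[float, float]]]  # flow[y][x] = (u, v)


def detect_moving_blocks(flow: FlowField,
                         block: int = 2,
                         threshold: Optional[float] = None) -> List[List[int]]:
    """Return a binary mask marking blocks with above-threshold motion."""
    height, width = len(flow), len(flow[0])

    # Flow magnitude, normalized to [0, 1] by the peak value.
    mag = [[math.hypot(u, v) for (u, v) in row] for row in flow]
    peak = max(max(row) for row in mag)
    if peak > 0:
        mag = [[m / peak for m in row] for row in mag]

    # Mean normalized magnitude of each non-overlapping block.
    scores = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            values = [mag[y][x]
                      for y in range(by, min(by + block, height))
                      for x in range(bx, min(bx + block, width))]
            scores[(by, bx)] = sum(values) / len(values)

    # Assumed stand-in for a histogram-based threshold: the mean block score.
    if threshold is None:
        threshold = sum(scores.values()) / len(scores)

    mask = [[0] * width for _ in range(height)]
    for (by, bx), value in scores.items():
        if value > threshold:
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    mask[y][x] = 1
    return mask
```

In a bi-directional setting, the same function could be applied to the backward and forward flow fields separately, with the two masks combined afterwards; how they are combined is a design choice this sketch does not settle.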
Video Compression:
Video surveillance is one of the most widely used and most active research applications of computer vision. Although much work has been done in the area of smart surveillance, there is still a need for an effective compression technique for compact archival and efficient transmission of the vast amount of surveillance video data. In this work, we propose a hybrid video compression approach based on foreground motion compensation for this application. The method combines the advantages of block-based and object-based coding techniques while reducing the drawbacks of both. It first segments the foreground moving objects from the background using adaptive thresholding-based optical flow techniques. Next, it determines the contour of the segmented foreground regions using the Freeman chain code. Subsequently, block-based motion estimation and compensation are performed using variants of particle swarm optimization. After that, motion failure areas are detected using a change detection method, and finally, DCT and Huffman coding-based entropy encoding are applied to represent the data compactly. Experimental results and analyses on different surveillance video sequences using Wilcoxon’s rank-sum test, PSNR, and SSIM show that our method outperforms other recent and relevant existing techniques.
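As an illustration of the contour-coding step, here is a minimal Python sketch of an 8-directional Freeman chain code. It assumes the contour is already available as an ordered list of 8-connected pixel coordinates, and it uses one common numbering convention (0 = east, increasing counter-clockwise, with the image y-axis pointing down). The function names `freeman_chain_code` and `decode_chain_code` are hypothetical and not taken from the published work.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in image coordinates, y grows downward

# Step (dx, dy) -> direction code, numbered counter-clockwise from east.
# With y pointing down, "north" corresponds to dy = -1.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def freeman_chain_code(contour: List[Point]) -> List[int]:
    """Encode an ordered, 8-connected contour as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-connected")
        codes.append(DIRECTIONS[step])
    return codes


def decode_chain_code(start: Point, codes: List[int]) -> List[Point]:
    """Rebuild the contour from its starting point and direction codes."""
    steps = {code: step for step, code in DIRECTIONS.items()}
    points = [start]
    for code in codes:
        dx, dy = steps[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points
```

Each contour pixel after the first is stored as a 3-bit code rather than a full coordinate pair, which is why chain codes are a compact way to represent object boundaries.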