Research

My research interests lie in robotics and automation, particularly in perception, planning, and robotic manipulation. I enjoy finding new ways to integrate computer vision and artificial intelligence into everyday life, biomedical imaging, and robotics.

Presentation on the project

S2C-DeLeNet: A Parameter Transfer Based Segmentation-Classification Integration for Detecting Skin Cancer Lesions from Dermoscopic Images

Research Project

Dermoscopic images ideally depict pigmentation attributes of the skin surface, which are highly regarded in the medical community for detecting skin abnormalities, disease, and even cancer. Identifying such abnormalities, however, requires trained eyes, and accurate detection is a time-intensive process. Computerized detection schemes have therefore become essential, especially those adopting deep learning. In this paper, we propose a convolutional deep neural network, S2C-DeLeNet, which (i) segments lesion regions from the unaffected skin tissue in dermoscopic images using a segmentation sub-network, and (ii) classifies each image by its medical condition type using parameters transferred from the segmentation sub-network.
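The parameter-transfer idea behind (ii) can be sketched as copying the shared encoder weights of a trained segmentation network into a classification network with the same backbone. This is a minimal illustration with toy weight dictionaries; the function and parameter names (`transfer_encoder_weights`, `encoder_prefix`) are hypothetical and not from the paper.

```python
import numpy as np

def transfer_encoder_weights(seg_weights, cls_weights, encoder_prefix="encoder."):
    """Copy encoder parameters from a trained segmentation network into a
    classification network sharing the same backbone (hypothetical helper).

    Both dicts map parameter names to arrays; only keys that start with
    `encoder_prefix`, exist in both networks, and match in shape are copied."""
    transferred = dict(cls_weights)
    for name, value in seg_weights.items():
        if (name.startswith(encoder_prefix)
                and name in cls_weights
                and cls_weights[name].shape == value.shape):
            transferred[name] = value.copy()
    return transferred

# Toy weight dictionaries standing in for real model state.
seg = {"encoder.conv1": np.ones((3, 3)), "decoder.up1": np.zeros((2, 2))}
cls = {"encoder.conv1": np.zeros((3, 3)), "head.fc": np.full((4,), 0.5)}

new_cls = transfer_encoder_weights(seg, cls)
```

Only the shared encoder is reused; the decoder of the segmentation network and the classification head remain independent.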

The paper is published in the journal Computers in Biology and Medicine, Elsevier.

Detecting blocks in blood vessels

A Sequence Agnostic Multimodal Fusion Preprocessing Scheme for Classification Network in Clogged Blood Vessel Detection in Alzheimer's Diagnosis

Research Competition

Reduced blood flow due to the blocking of blood vessels is a significant pathological feature of the brain affected by Alzheimer's disease. These blocks are identified from two-photon excitation fluorescence (TPEF) microscopy of the brain, which yields spatial and depth-time variable image samples of the vessel structures. In this study, we propose preprocessing techniques on such data to help identify stalled and non-stalled brain capillaries. We use processed image data and point-cloud data on two separate streams feeding different state-of-the-art video classification networks. Since the two streams have different modalities and contain complementary information, an early fusion of the two streams allows the combined model to achieve better performance in stalled versus non-stalled vessel classification. Our experimental results on the Clog Loss dataset show that the proposed technique consistently improves the performance of all the baseline methods.
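Early fusion, as described above, amounts to concatenating per-sample features from the two modalities before they reach the classifier. The sketch below shows the operation on random feature matrices; the dimensions and the name `early_fuse` are illustrative assumptions, not the actual network interface.

```python
import numpy as np

def early_fuse(image_feats, cloud_feats):
    """Early fusion: concatenate per-sample features from the two
    modalities along the channel axis before classification.

    image_feats: (N, D1) features from the processed-image stream
    cloud_feats: (N, D2) features from the point-cloud stream
    returns:     (N, D1 + D2) fused feature matrix"""
    assert image_feats.shape[0] == cloud_feats.shape[0]
    return np.concatenate([image_feats, cloud_feats], axis=1)

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 128))   # stand-in for video-network features
pcd = rng.standard_normal((8, 64))    # stand-in for point-cloud features
fused = early_fuse(img, pcd)
```

Because fusion happens before the classifier, the downstream network can learn cross-modal interactions rather than combining two independent predictions.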

Submitted to ICASSP 2023.

Our presentation in the finals!

Vehicle Detection and Tracking from Fisheye Images at Intersections

The dataset provided for our task contained traffic images captured at intersections with fisheye cameras. The first task is to detect the vehicles present in each frame; the second is to track the detected vehicles and obtain their trajectories. The dataset poses a few challenges: the fisheye lens introduces spherical distortion, and the overhead camera positioning produces vehicles in various orientations. It also contains samples from different times of the day, namely day and night. In our detection framework, we first classify each frame as day or night using a SqueezeNet model and then select model weights for the appropriate lighting conditions. We use two state-of-the-art detectors, UniverseNet and YOLOv5, in parallel to detect the vehicles. After concatenating the detections from the two models, we apply non-max suppression (NMS) to generate the final detections. For tracking the detected vehicles, we use the SORT algorithm.
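The NMS step that merges the two detectors' outputs can be sketched with the standard greedy IoU-based algorithm. This is a generic NumPy illustration, not the project's actual implementation; box coordinates are `[x1, y1, x2, y2]`.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-max suppression: repeatedly keep the highest-scoring box
    and drop remaining boxes whose IoU with it exceeds the threshold.
    Returns indices of kept boxes, highest score first."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with each remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

# Detections from two models, concatenated before suppression:
# the first two boxes overlap heavily, so only the higher-scoring one survives.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Running NMS on the concatenated detections removes duplicates where UniverseNet and YOLOv5 fire on the same vehicle.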

An overview of the work

Unsupervised Anomaly Detection in Intelligent and Heterogeneous Autonomous Systems

Anomaly detection from drone flight data is a challenging task. In this work, we propose an ensemble of classical and neural-network-based approaches to automatically localize and quantify anomalies in drone flight data in both the time and sensor domains. Our experiments showed that using the IMU sensor values is the most efficient approach for this task.
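One classical baseline of the kind such an ensemble might include is rolling-window deviation scoring on a single IMU channel. The sketch below is an illustrative assumption (the function `localize_anomalies`, the window size, and the threshold are not from the project) showing how anomalies can be localized in the time domain.

```python
import numpy as np

def localize_anomalies(signal, window=5, z_thresh=3.0):
    """Flag time steps whose value deviates strongly from a
    rolling-window estimate of recent mean and spread
    (a simple classical baseline; parameters are illustrative)."""
    flags = np.zeros(len(signal), dtype=bool)
    for t in range(window, len(signal)):
        ref = signal[t - window:t]
        std = ref.std()
        if std > 0 and abs(signal[t] - ref.mean()) > z_thresh * std:
            flags[t] = True
    return flags

# Synthetic IMU channel: steady noise with one injected spike at t = 120.
rng = np.random.default_rng(1)
imu = rng.normal(0, 0.1, 200)
imu[120] += 5.0
flags = localize_anomalies(imu)
```

Per-sensor flags like these can then be aggregated across channels to localize anomalies in the sensor domain as well.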

GitHub link: https://github.com/ClockWorkKid/SPCUP2020-BUET-Synapticans

Detection framework

BanglaNum - A Novel Dataset for Bengali Digit Recognition from Speech

Digital Signal Processing Laboratory Project

Automatic speech recognition (ASR) converts the human voice into readily understandable and categorized text or words. Although Bengali is one of the most widely spoken languages in the world, there have been very few studies on Bengali ASR, particularly on Bangladeshi-accented Bengali. In this study, audio recordings of spoken digits (0-9) from university students were used to create a Bengali speech digits dataset that may be employed to train artificial neural networks for voice-based digital input systems. This paper also compares the Bengali digit recognition accuracy of several Convolutional Neural Networks (CNNs) using spectrograms and shows that a test accuracy of 98.23% is achievable using parameter-efficient models such as SqueezeNet on our dataset.
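The spectrogram input that the CNN classifiers consume can be sketched with a plain short-time Fourier transform. The frame length, hop size, and synthetic tone below are illustrative assumptions; the paper's exact spectrogram settings may differ.

```python
import numpy as np

def spectrogram(audio, frame_len=256, hop=128):
    """Magnitude spectrogram via a simple STFT: slide a Hann-windowed
    frame over the waveform and take the magnitude of each frame's FFT.
    Returns an array of shape (freq_bins, time_frames)."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(audio) - frame_len + 1, hop):
        frame = audio[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.stack(frames, axis=1)

# One second of a synthetic 440 Hz tone at 8 kHz standing in for a recording.
sr = 8000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(audio)
```

The resulting 2-D array is treated like an image, which is what lets compact image models such as SqueezeNet classify the spoken digits.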

Submitted to ICECE 2022. The dataset is available on Kaggle: https://www.kaggle.com/datasets/mirsayeed/banglanum-bengali-number-recognition-from-voice